Abstract



A quantum channel is a mapping which sends density matrices to density 
matrices. The estimation of quantum channels is of great importance to the 
field of quantum information. In this thesis two topics related to estima- 
tion of quantum channels are investigated. The first of these is the upper 
bound of Sarovar and Milburn (2006) on the Fisher information obtainable 
by measuring the output of a channel. Two questions raised by Sarovar and 
Milburn about their bound are answered. A Riemannian metric on the space 
of quantum states is introduced, related to the construction of the Sarovar 
and Milburn bound. Its properties are characterized. 

The second topic investigated is the estimation of unitary channels. The 
situation is considered in which an experimenter has several non-identical 
unitary channels that have the same parameter. It is shown that it is possible 
to improve estimation using the channels together, analogous to the case of 
identical unitary channels. Also, a new method of phase estimation is given 
based on a method sketched by Kitaev (1996). Unlike other phase estimation 
procedures which perform similarly, this procedure requires only very basic 
experimental resources. 



Chapter 1 

Mathematical Background 



1.1 Overview 



This thesis is concerned with estimation of quantum channels. Almost every
protocol in quantum information uses quantum channels. They are used in
important protocols such as teleportation, Deutsch's algorithm, the Grover
search algorithm and the Shor factorization algorithm (Le Bellac, 2006,
Chapters 5 and 7). In theory it is assumed that a channel is known
precisely, yet in practice this will not generally be the case. Thus the
estimation of quantum channels is of fundamental importance to the field of
quantum information.

Chapter 1 contains the mathematical and quantum-theoretic background
needed to understand the thesis. Chapters 2 and 3 are concerned with the
upper bound of Sarovar and Milburn (2006) on the Fisher information
obtainable by measuring the output of a channel. Chapters 4 and 5 consider
estimation of unitary channels.

In Chapter 1 definitions are given of fundamental objects such as quantum
systems, quantum states, quantum measurements and combined systems.
Quantum channels are defined and quantum channel estimation introduced.
A brief historical background is given of the key developments in channel
estimation. Chapter 1 also contains a few small, new results.

Chapter 2 considers work by Sarovar and Milburn (2006), who introduced
an upper bound on the Fisher information obtained by measuring the output
states of quantum channels. They showed that for certain channels, called
quasi-classical channels, their bound is attainable. They asked (i) whether
their bound is attainable more generally; (ii) whether or not it is possible to
find an explicit expression for measurements attaining this bound. Both of
these questions are answered in Chapter 2.

In the process of answering the previous questions, Chapter 2 shows that
Sarovar and Milburn's work leads to a new Riemannian metric on the space of
quantum states. Chapter 3 considers the questions: What are the properties
of this new metric? Is it well defined?

Chapters 4 and 5 are concerned with the cost of estimation of unitary
channels. It is known that when there are $n$ identical copies of a unitary
channel, there exists (Kahn, 2007) an estimation procedure such that the
cost function (a function of the expected fidelity, see (1.90)) is $O(1/n^2)$,
instead of the usual $O(1/n)$. Chapter 4 considers the question: If there are
$n$ unitary channels which are not identical, but have the same parameter, is
an analogous speed-up possible?

Kitaev (1996) sketched an iterative method for phase estimation such that
the cost function is $O((\log n/n)^2)$. This method requires only a single copy
of a unitary channel and basic measurements. In Chapter 5 it is shown that
several attempts to give a detailed method for iterative phase estimation have
been unsuccessful. There have been other successful iterative methods, but
these require an extra rotation gate capable of performing arbitrary rotations
with almost perfect accuracy. Thus Chapter 5 seeks to answer the question:
Does a complete iterative phase estimation method exist which requires only
a single copy of the unitary and basic measurements?



1.2 System 

A quantum system is a physical system that obeys the laws of quantum
mechanics. The state of a quantum system (or quantum state, or quantum state
of a system) is a quantification of the system, which, if known, allows an
experimenter to make accurate predictions about the results of any future
measurements on that system (Gill, 2001). Since measurement results are
probabilistic, knowledge of a quantum state means that, given any
measurement, it is possible to work out the long-term relative frequency of the
observed outcomes.

A quantum system is represented by a complex Hilbert space $\mathcal{H}$ of
dimension $d$, with a Hermitian inner product. The dimension $d$ is given by the
maximum number of distinguishable states in the system. For the spin of an
electron, or the polarisation of a photon, $\mathcal{H} = \mathbb{C}^2$. This is
because there are only two distinguishable states: spin up and spin down. Any
other quantum state can be represented as a complex linear combination of
these two states.

In this thesis only finite dimensional complex vector spaces are considered.
Any column vector in a complex vector space is denoted by $|\psi\rangle$; the
symbol $\psi$ is a label, while $|\cdot\rangle$ denotes that the object is a
complex column vector. This representation of complex vectors is called Dirac
notation. Given a vector

\[ |\psi\rangle = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_n \end{pmatrix}, \tag{1.1} \]

its dual $\langle\psi|$ is defined as (with $\psi_j^*$ denoting the complex
conjugate of $\psi_j$)

\[ \langle\psi| = (\psi_1^*, \psi_2^*, \dots, \psi_n^*). \tag{1.2} \]

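An inner product is a map that sends a pair of complex vectors to a complex
number; it is linear in its second argument and conjugate-linear in its first.
Given the vectors $|\psi\rangle$ and $|\phi\rangle$, the inner product between
these vectors is denoted by $\langle\psi|\phi\rangle$. There are many different
inner products. In this thesis, only the following inner product will be used

\[ \langle\psi|\phi\rangle = (\psi_1^*, \psi_2^*, \dots, \psi_n^*) \begin{pmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_n \end{pmatrix} \tag{1.3} \]

\[ \phantom{\langle\psi|\phi\rangle} = \sum_{i=1}^{n} \psi_i^* \phi_i. \tag{1.4} \]

The norm of a vector $|\psi\rangle$ can be defined as
$\|\psi\| = \sqrt{\langle\psi|\psi\rangle}$. Vectors with norm equal to one are
defined as unit vectors. Vectors $|\psi\rangle$ and $|\phi\rangle$ are
orthogonal if their inner product is zero.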

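Given the vectors $|\psi\rangle$ and $|\phi\rangle$, the outer product
$|\phi\rangle\langle\psi|$ is given by

\[ |\phi\rangle\langle\psi| = \begin{pmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_n \end{pmatrix} (\psi_1^*, \psi_2^*, \dots, \psi_n^*) \tag{1.5} \]

\[ \phantom{|\phi\rangle\langle\psi|} = \begin{pmatrix} \phi_1\psi_1^* & \phi_1\psi_2^* & \cdots & \phi_1\psi_n^* \\ \phi_2\psi_1^* & \phi_2\psi_2^* & \cdots & \phi_2\psi_n^* \\ \vdots & \vdots & \ddots & \vdots \\ \phi_n\psi_1^* & \phi_n\psi_2^* & \cdots & \phi_n\psi_n^* \end{pmatrix}. \tag{1.6} \]

In Dirac notation $\langle\psi|\phi\rangle$ represents a complex number, and
$|\phi\rangle\langle\psi|$ a matrix.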

A set of vectors $|u_1\rangle, \dots, |u_n\rangle$ is orthonormal if the vectors
are normalized and orthogonal, i.e. $\langle u_i|u_j\rangle = \delta_{ij}$. Given
a set of orthonormal vectors $|u_1\rangle, \dots, |u_n\rangle$ in a vector space
$V$, such that $n = \dim V$, this set of vectors forms an orthonormal basis of
$V$. Any vector $|v\rangle$ in $V$ can be written as a linear combination of
these vectors, i.e.

\[ |v\rangle = \sum_{i=1}^{n} v_i |u_i\rangle, \qquad v_i = \langle u_i|v\rangle \in \mathbb{C}. \tag{1.7} \]

Lemma 1.1 Given an orthonormal basis $|u_1\rangle, \dots, |u_n\rangle$ for a
vector space $V$,

\[ \sum_{i=1}^{n} |u_i\rangle\langle u_i| = I_n. \tag{1.8} \]

This is called the completeness relation (Nielsen and Chuang, 2000, p. 67).

Proof. If $|u_1\rangle, \dots, |u_n\rangle$ is an orthonormal basis for $V$, then
any $|v\rangle \in V$ can be written as $|v\rangle = \sum_i v_i |u_i\rangle$,
where $v_i = \langle u_i|v\rangle$. Now,

\[ \left( \sum_{i=1}^{n} |u_i\rangle\langle u_i| \right) |v\rangle = \sum_{i=1}^{n} v_i |u_i\rangle = |v\rangle. \tag{1.9} \]

Since this holds for all $|v\rangle$, the result follows.
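
As a numerical illustration of the inner product (1.4), the outer product (1.6) and the completeness relation (1.8), the following minimal sketch represents vectors as plain Python lists of complex numbers. The function names and the particular orthonormal basis of $\mathbb{C}^2$ are illustrative assumptions, not part of the thesis.

```python
# Vectors are Python lists of complex numbers; all names are illustrative.

def inner(psi, phi):
    """Inner product <psi|phi> = sum_i conj(psi_i) * phi_i, as in (1.4)."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

def outer(phi, psi):
    """Outer product |phi><psi|: matrix with (i, j) entry phi_i * conj(psi_j)."""
    return [[a * b.conjugate() for b in psi] for a in phi]

s = 2 ** -0.5
u1 = [s + 0j, 1j * s]     # an orthonormal basis of C^2, chosen for illustration
u2 = [s + 0j, -1j * s]

overlap = inner(u1, u2)               # vanishes for orthogonal vectors
norm_u1 = abs(inner(u1, u1)) ** 0.5   # equals one for a unit vector

# Completeness relation (1.8): |u1><u1| + |u2><u2| should be the identity.
p1, p2 = outer(u1, u1), outer(u2, u2)
completeness = [[p1[i][j] + p2[i][j] for j in range(2)] for i in range(2)]
```

Up to floating-point rounding, `completeness` is the 2 x 2 identity matrix, in agreement with Lemma 1.1.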

The Hermitian transpose of a matrix $A$, denoted by $A^\dagger$, is the matrix
found by taking the transpose of $A$ and replacing each entry with its complex
conjugate ($[A^\dagger]_{ij} = [A]_{ji}^*$). A Hermitian matrix (also called a
self-adjoint matrix) is a matrix which is equal to its Hermitian transpose, i.e.
$B$ is Hermitian if $B^\dagger = B$. The Pauli matrices, given in (1.15), are
examples of Hermitian matrices.

Any matrix $B$ that is Hermitian can be diagonalized, that is written in
the form $UDU^\dagger$, where $D$ is a diagonal matrix and $U$ is a unitary
matrix, or equivalently in terms of its eigenvalues $\{a_i\}$ and eigenvectors
$\{|w_i\rangle\}$ as

\[ B = \sum_i a_i |w_i\rangle\langle w_i|. \]



1.3 States 

Pure states will now be introduced. These form a subset of the set of all
quantum states. A pure state of dimension $d$ can be represented by a
$d$-dimensional complex unit vector $|\psi\rangle$. For real $\theta$, the
vectors $|\psi\rangle$ and $e^{i\theta}|\psi\rangle$ represent the same state.

More generally, a $d$-dimensional quantum state is represented by a $d \times d$
matrix $\rho$, also called a density matrix. This is a linear operator which acts
on a complex Hilbert space $\mathcal{H}$, is non-negative
($v^\dagger \rho v \geq 0$ for all $v \in \mathbb{C}^d$) and has trace 1. A
consequence of being non-negative is that $\rho$ is self-adjoint. The set of
states in a complex Hilbert space $\mathcal{H}$ will be denoted by
$S(\mathcal{H})$.

A pure state can be referred to either by its state vector $|\psi\rangle$, or by
its density matrix $\rho = |\psi\rangle\langle\psi|$. For example,

\[ |\psi\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \rho = |\psi\rangle\langle\psi| = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}. \tag{1.10} \]

States which are not pure (have rank greater than one) are called mixed
states. A simple test for whether a state $\rho$ is pure or mixed is to take the
trace of $\rho^2$. For pure states $\mathrm{tr}\{\rho^2\} = 1$; for mixed states
$\mathrm{tr}\{\rho^2\} < 1$.

Examples of 2-dimensional mixed states are

\[ \rho_1 = \begin{pmatrix} 3/4 & 0 \\ 0 & 1/4 \end{pmatrix}, \qquad \rho_2 = \begin{pmatrix} 1/2 & -1/6 \\ -1/6 & 1/2 \end{pmatrix}. \tag{1.11} \]

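A mixed state can be expressed as a mixture of pure states in many
different ways. For instance, the state $\rho_1$, given in (1.11), can be
written as

\[ \rho_1 = \frac{3}{4} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \frac{1}{4} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \]

\[ \phantom{\rho_1} = \frac{1}{2} \begin{pmatrix} 3/4 & \sqrt{3}/4 \\ \sqrt{3}/4 & 1/4 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 3/4 & -\sqrt{3}/4 \\ -\sqrt{3}/4 & 1/4 \end{pmatrix} \]

\[ \phantom{\rho_1} = \frac{1}{4} \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix} + \frac{1}{4} \begin{pmatrix} 1/2 & -1/2 \\ -1/2 & 1/2 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \]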
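
The purity test and the non-uniqueness of the decomposition of $\rho_1$ can be checked numerically. The sketch below is a minimal illustration under the assumption that 2 x 2 matrices are stored as nested Python lists; the helper names `matmul`, `purity` and `mix` are hypothetical.

```python
# Minimal sketch: 2x2 matrices as nested lists; helper names are illustrative.

def matmul(a, b):
    """Matrix product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def purity(rho):
    """tr{rho^2}: 1 for a pure state, strictly less than 1 for a mixed state."""
    sq = matmul(rho, rho)
    return sum(sq[i][i] for i in range(len(rho))).real

def mix(weights, states):
    """Convex combination sum_k w_k * rho_k of 2x2 matrices."""
    return [[sum(w * s[i][j] for w, s in zip(weights, states))
             for j in range(2)] for i in range(2)]

r = 3 ** 0.5 / 4
rho1 = [[0.75, 0.0], [0.0, 0.25]]

# The three decompositions of rho1 into pure states given in the text.
dec_a = mix([0.75, 0.25], [[[1, 0], [0, 0]], [[0, 0], [0, 1]]])
dec_b = mix([0.5, 0.5], [[[0.75, r], [r, 0.25]], [[0.75, -r], [-r, 0.25]]])
dec_c = mix([0.25, 0.25, 0.5],
            [[[0.5, 0.5], [0.5, 0.5]], [[0.5, -0.5], [-0.5, 0.5]],
             [[1, 0], [0, 0]]])

purity_mixed = purity(rho1)                     # 0.625, so rho1 is mixed
purity_pure = purity([[0.75, r], [r, 0.25]])    # close to 1: a pure component
```

All three mixtures reproduce the same matrix $\rho_1$, even though they are built from different pure states.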

The set of 2-dimensional states, which is of great importance to the theory
of quantum information, will now be investigated. A 2-dimensional quantum
state is called a qubit. This is because qubits are the quantum analogue of
'bits' (binary digits). In quantum information qubits are used to store and
transmit information. Pure qubits are often expressed in the basis

\[ |0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \]

The state $|\psi\rangle$, given in (1.10), can be expressed as
$(|0\rangle + |1\rangle)/\sqrt{2}$.

Any 2-dimensional quantum state can be written, with specific values of
$x$, $y$ and $z$, as

\[ \rho = \frac{1}{2} \begin{pmatrix} 1 + z & x - iy \\ x + iy & 1 - z \end{pmatrix}, \tag{1.13} \]

where $x^2 + y^2 + z^2 \leq 1$. The set of pure states corresponds to those
states for which $x^2 + y^2 + z^2 = 1$. Any 2-dimensional state can be thought
of as being a point with Cartesian co-ordinates $(x, y, z)$ contained within a
unit ball, known as the Bloch ball or Poincaré ball. The points on the surface
of the ball correspond to pure states; the points within the ball to mixed
states.



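Alternatively, the state (1.13) can be written in terms of the identity and
Pauli matrices

\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{1.14} \]

\[ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{1.15} \]

as

\[ \rho = \frac{1}{2}(I + x\sigma_x + y\sigma_y + z\sigma_z), \qquad x, y, z \in \mathbb{R}. \tag{1.16} \]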
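
The correspondence between density matrices and points of the Bloch ball can be sketched in a few lines. The function names below are illustrative assumptions; the conversion uses $x = \mathrm{tr}\{\rho\sigma_x\}$, $y = \mathrm{tr}\{\rho\sigma_y\}$ and $z = \mathrm{tr}\{\rho\sigma_z\}$, which follow directly from (1.13).

```python
# Minimal sketch of the Bloch-ball parametrization (1.13); names are illustrative.

def bloch_to_rho(x, y, z):
    """Density matrix (1.13) for the point (x, y, z) of the Bloch ball."""
    return [[(1 + z) / 2, (x - 1j * y) / 2],
            [(x + 1j * y) / 2, (1 - z) / 2]]

def rho_to_bloch(rho):
    """Recover (x, y, z) from a 2x2 density matrix."""
    lower = complex(rho[1][0])            # equals (x + iy) / 2
    x = 2 * lower.real
    y = 2 * lower.imag
    z = complex(rho[0][0] - rho[1][1]).real
    return x, y, z

def is_pure(x, y, z, tol=1e-12):
    """Pure states lie on the surface of the ball: x^2 + y^2 + z^2 = 1."""
    return abs(x * x + y * y + z * z - 1.0) < tol

# The pure state (1.10) sits on the surface; rho_2 of (1.11) lies inside.
point_pure = rho_to_bloch([[0.5, 0.5], [0.5, 0.5]])
point_mixed = rho_to_bloch([[1 / 2, -1 / 6], [-1 / 6, 1 / 2]])
```

Here `point_pure` is the point $(1, 0, 0)$ on the surface of the ball, while `point_mixed` lies on the $x$-axis at distance $1/3$ from the centre.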



1.4 Measurements 



In this section measurements are introduced. When a state is measured, a
single result, $m$, is observed out of a set $\Omega$ of possible outcomes.
Associated with each outcome $m$ is a matrix $M_m$. The set of matrices
$\{M_m\}$ constitute a measurement. The following conditions are imposed on
$M_m$

\[ M_m^\dagger = M_m, \qquad M_m \geq 0, \qquad \sum_m M_m = I. \tag{1.17} \]

A measurement of this form is called a Positive Operator Valued Measure, or
POVM for short.

Given a state $\rho$ and a measurement $M = \{M_m\}$, the result $m$ is observed
with probability given by the Born rule

\[ p(m) = \mathrm{tr}\{\rho M_m\}. \tag{1.18} \]

The $p(m)$ defined in (1.18) satisfies
(i) $p(m) \geq 0$, as $\rho \geq 0$ and $M_m \geq 0$,
(ii) $\sum_m p(m) = 1$, as

\[ \sum_m \mathrm{tr}\{\rho M_m\} = \mathrm{tr}\Big\{\rho \sum_m M_m\Big\} = \mathrm{tr}\{\rho I\} = 1. \]



More generally (Busch et al., 1995, p. 23), one can consider a non-empty
set $\Omega$ and a sigma algebra $\mathcal{F}$, which is a collection of subsets
of elements of $\Omega$ that obeys certain rules. Together $\Omega$ and
$\mathcal{F}$ give what is known as a measurable space $(\Omega, \mathcal{F})$.
A POVM over a measurable space $(\Omega, \mathcal{F})$ is a set
$\{M(A_i)\}_{A_i \in \mathcal{F}}$ of operators on $\mathcal{H}$ such that

\[ M(A_i) \geq 0, \quad \text{for all } A_i \in \mathcal{F}, \tag{1.19} \]

\[ M(\cup_i A_i) = \sum_i M(A_i), \quad \text{for disjoint } A_i, \tag{1.20} \]

\[ M(\Omega) = I. \tag{1.21} \]

Applying the measurement $M$ to a state $\rho$ yields an outcome in $A_i$ with
probability

\[ p(i) = \mathrm{tr}\{\rho M(A_i)\}. \tag{1.22} \]

The most commonly used measurements are Projection Valued Measures
(abbreviated to PVMs). A PVM is a POVM with elements, usually written
as $P_m$, which satisfy $P_m P_{m'} = \delta_{mm'} P_m$. (These measurements are
also called projective measurements.)

A PVM $\{P_m\}$ is associated with an observable $M$, a Hermitian operator on
$\mathcal{H}$ (Nielsen and Chuang, 2000, p. 87). The observable has spectral
decomposition

\[ M = \sum_m m P_m. \]

A simple example of a 2-dimensional observable is

\[ \sigma_x = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix} - \begin{pmatrix} 1/2 & -1/2 \\ -1/2 & 1/2 \end{pmatrix}. \]

Measuring this observable corresponds to using the PVM

\[ M_x = (M_0, I - M_0), \qquad M_0 = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}. \tag{1.23} \]

The term 'measuring in x' refers to using the PVM $M_x$ (for which
$M_0 = (I + \sigma_x)/2$). Similarly, the term 'measuring in y' refers to using
the PVM $M_y = (M_0, I - M_0)$, with $M_0 = (I + \sigma_y)/2$, and 'measuring in
z' to using the PVM $M_z = (M_0, I - M_0)$, with $M_0 = (I + \sigma_z)/2$.
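
The Born rule (1.18) for the three PVMs just described can be illustrated as follows. This is a minimal sketch with hypothetical helper names (`pvm`, `born`), applied to the pure state (1.10); it is not taken from the thesis.

```python
# Minimal sketch of the Born rule (1.18) for 'measuring in x, y, z'.

def matmul(a, b):
    """Matrix product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def pvm(sigma):
    """PVM (M_0, I - M_0) with M_0 = (I + sigma) / 2."""
    m0 = [[(I2[i][j] + sigma[i][j]) / 2 for j in range(2)] for i in range(2)]
    m1 = [[I2[i][j] - m0[i][j] for j in range(2)] for i in range(2)]
    return m0, m1

def born(rho, povm):
    """Outcome probabilities p(m) = tr{rho M_m}."""
    return [complex(trace(matmul(rho, m))).real for m in povm]

rho = [[0.5, 0.5], [0.5, 0.5]]     # the pure state (1.10)
probs_x = born(rho, pvm(SX))       # outcome 0 is certain
probs_y = born(rho, pvm(SY))       # both outcomes equally likely
probs_z = born(rho, pvm(SZ))       # both outcomes equally likely
```

Since the state (1.10) is an eigenvector of $\sigma_x$, measuring in x gives outcome 0 with certainty, while measuring in y or z gives each outcome with probability 1/2.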

A POVM gives only information about data from a measurement. To
describe how a state is changed by a measurement it is necessary to use
instruments. More information on instruments is given in Barndorff-Nielsen
et al. (2003).



1.5 Combined systems



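The tensor product is a mathematical operation which can be used to combine
vector spaces to form a larger vector space. Given two vector spaces $V$ and
$W$, it is possible to combine them to form the vector space $V \otimes W$, with
$\dim(V \otimes W) = \dim V \times \dim W$. Given vectors $|v\rangle \in V$ and
$|w\rangle \in W$, the vector $|v\rangle \otimes |w\rangle \in V \otimes W$. The
vector $|v\rangle \otimes |w\rangle$ is computed from $|v\rangle$ and
$|w\rangle$ in the following way

\[ \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_m \end{pmatrix} \otimes \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix} = \begin{pmatrix} v_1 w_1 \\ v_1 w_2 \\ \vdots \\ v_1 w_n \\ \vdots \\ v_m w_n \end{pmatrix}. \]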



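Similarly, given two matrices

\[ A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \qquad B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}, \]

their tensor product is equal to

\[ A \otimes B = \begin{pmatrix} A_{11}B & A_{12}B \\ A_{21}B & A_{22}B \end{pmatrix} = \begin{pmatrix} A_{11}B_{11} & A_{11}B_{12} & A_{12}B_{11} & A_{12}B_{12} \\ A_{11}B_{21} & A_{11}B_{22} & A_{12}B_{21} & A_{12}B_{22} \\ A_{21}B_{11} & A_{21}B_{12} & A_{22}B_{11} & A_{22}B_{12} \\ A_{21}B_{21} & A_{21}B_{22} & A_{22}B_{21} & A_{22}B_{22} \end{pmatrix}. \]

More generally for an $M \times N$ matrix $A$ and a $P \times Q$ matrix $B$,
$A \otimes B$ is an $MP \times NQ$ matrix with entries

\[ A \otimes B = \begin{pmatrix} A_{11}B & \cdots & A_{1N}B \\ \vdots & \ddots & \vdots \\ A_{M1}B & \cdots & A_{MN}B \end{pmatrix} = \begin{pmatrix} A_{11}B_{11} & A_{11}B_{12} & \cdots & A_{1N}B_{1Q} \\ A_{11}B_{21} & A_{11}B_{22} & \cdots & A_{1N}B_{2Q} \\ \vdots & \vdots & \ddots & \vdots \\ A_{M1}B_{P1} & A_{M1}B_{P2} & \cdots & A_{MN}B_{PQ} \end{pmatrix}. \]

The following rules hold for tensor products

\[ (A \otimes B)(|v\rangle \otimes |w\rangle) = A|v\rangle \otimes B|w\rangle \]
\[ A \otimes (B + C) = A \otimes B + A \otimes C \]
\[ (A \otimes B)(C \otimes D) = AC \otimes BD \]
\[ \mathrm{tr}(A \otimes B) = (\mathrm{tr} A)(\mathrm{tr} B) \]
\[ (A \otimes B)^\dagger = A^\dagger \otimes B^\dagger. \]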
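
The block structure of the tensor product and two of the rules above can be checked with a short sketch. The function `kron` and the integer test matrices are illustrative assumptions; column vectors are treated as $n \times 1$ matrices.

```python
# Minimal sketch of the tensor (Kronecker) product; names are illustrative.

def kron(a, b):
    """Tensor product of matrices given as lists of rows.

    Row (i, k) of the result is the concatenation over j of A_ij times row k of B.
    """
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]

def matmul(a, b):
    """Product of an (m x n) and an (n x p) matrix."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]
D = [[1, 1], [0, 2]]

# Rule (A (x) B)(C (x) D) = AC (x) BD.
lhs = matmul(kron(A, B), kron(C, D))
rhs = kron(matmul(A, C), matmul(B, D))

# Rule tr(A (x) B) = (tr A)(tr B).
trace_lhs = trace(kron(A, B))
trace_rhs = trace(A) * trace(B)

# Tensor product of column vectors, written as n x 1 matrices.
v = [[1], [2]]
w = [[3], [4], [5]]
vw = kron(v, w)
```

With integer entries both sides of each rule agree exactly, and `vw` has $2 \times 3 = 6$ entries, matching $\dim(V \otimes W) = \dim V \times \dim W$.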

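Given two different quantum systems represented by Hilbert spaces
$\mathcal{H}_A$ and $\mathcal{H}_B$, the combined system is represented by the
tensor product of these Hilbert spaces, i.e.
$\mathcal{H}_A \otimes \mathcal{H}_B$, which will be labelled
$\mathcal{H}_{A,B}$.

If $\rho^A \in S(\mathcal{H}_A)$ and $\rho^B \in S(\mathcal{H}_B)$, then the
composite state in $S(\mathcal{H}_{A,B})$ is
$\rho^{A,B} = \rho^A \otimes \rho^B$. Given two pure states represented by
$|\psi^A\rangle$ and $|\psi^B\rangle$, the composite system is in the state
represented by $|\psi^{A,B}\rangle = |\psi^A\rangle \otimes |\psi^B\rangle$. For
the sake of brevity the sign $\otimes$ will usually be omitted, and the
composite state written as $|\psi^A \psi^B\rangle$. Of special interest are
states, especially pure ones, which exist in the composite system but cannot be
written in the form $|\psi^A \psi^B\rangle$. Examples of such states are the
Bell states,

\[ |\phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle) \tag{1.24} \]

\[ |\phi^-\rangle = \frac{1}{\sqrt{2}}(|00\rangle - |11\rangle) \tag{1.25} \]

\[ |\psi^+\rangle = \frac{1}{\sqrt{2}}(|01\rangle + |10\rangle) \tag{1.26} \]

\[ |\psi^-\rangle = \frac{1}{\sqrt{2}}(|01\rangle - |10\rangle). \tag{1.27} \]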



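Definition 1.1 A pure state $|\psi^{A,B}\rangle \in \mathcal{H}_{A,B}$ which
cannot be written as $|\psi^A\rangle \otimes |\psi^B\rangle$ is said to be
entangled. More generally, we can consider entangled mixed states. A state
$\rho \in S(\mathcal{H}_{A,B})$ which cannot be written as a mixture of
separable pure states
$|\psi_i^A\rangle \otimes |\psi_i^B\rangle \in \mathcal{H}_{A,B}$, i.e. as

\[ \rho = \sum_i p_i \, |\psi_i^A\rangle\langle\psi_i^A| \otimes |\psi_i^B\rangle\langle\psi_i^B|, \]

is said to be an entangled state.

Definition 1.2 States that are not entangled are said to be separable.



1.5.1 Partial trace

The partial trace is an important operation when considering combined
systems. Given a density matrix $\rho^{AB} \in S(\mathcal{H}_{A,B})$, the state
of system $\mathcal{H}_A$ is found by taking the partial trace over
$\mathcal{H}_B$,

\[ \rho^A = \mathrm{tr}_B\{\rho^{AB}\}, \tag{1.28} \]

where the partial trace $\mathrm{tr}_B$ is defined as

\[ \mathrm{tr}_B\{|\phi_1\rangle\langle\phi_2| \otimes |\psi_1\rangle\langle\psi_2|\} = |\phi_1\rangle\langle\phi_2| \, \mathrm{tr}\{|\psi_1\rangle\langle\psi_2|\}. \]

A state found by taking the partial trace of a larger state on a combined
system is known as a reduced state.

The density matrix for the Bell state $|\phi^+\rangle$, given in (1.24), is

\[ \rho = |\phi^+\rangle\langle\phi^+| = \frac{1}{2}\big(|00\rangle\langle 00| + |00\rangle\langle 11| + |11\rangle\langle 00| + |11\rangle\langle 11|\big). \]

Taking the partial trace over $\mathcal{H}_B$ gives

\[ \rho^A = \mathrm{tr}_B(\rho) = \frac{1}{2}\big(|0\rangle\langle 0| \, \mathrm{tr}(|0\rangle\langle 0|) + |1\rangle\langle 0| \, \mathrm{tr}(|1\rangle\langle 0|) + |0\rangle\langle 1| \, \mathrm{tr}(|0\rangle\langle 1|) + |1\rangle\langle 1| \, \mathrm{tr}(|1\rangle\langle 1|)\big) \]

\[ \phantom{\rho^A} = \frac{1}{2}\big(|0\rangle\langle 0| + |1\rangle\langle 1|\big). \]

Note that, although the composite state is pure, the reduced state is mixed.
This is one of the interesting properties of entanglement.
If $\rho^{AB} = \rho \otimes \sigma$ then

\[ \rho^A = \mathrm{tr}_B\{\rho \otimes \sigma\} = \rho \, \mathrm{tr}\{\sigma\} = \rho, \]

as would be expected. Similarly, $\rho^B = \mathrm{tr}_A\{\rho^{AB}\} = \sigma$.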
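
The two partial-trace calculations above can be reproduced numerically. The sketch below assumes that the basis vector $|a\rangle \otimes |b\rangle$ of the composite system has index $a \cdot d_B + b$, which matches the tensor product ordering of Section 1.5; the function names are hypothetical.

```python
# Minimal sketch of the partial trace; names and index convention are assumptions.

def kron(a, b):
    """Tensor product of matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def partial_trace_b(rho, da, db):
    """Trace out system B from a (da*db) x (da*db) matrix."""
    return [[sum(rho[i * db + k][j * db + k] for k in range(db))
             for j in range(da)] for i in range(da)]

def partial_trace_a(rho, da, db):
    """Trace out system A from a (da*db) x (da*db) matrix."""
    return [[sum(rho[k * db + i][k * db + j] for k in range(da))
             for j in range(db)] for i in range(db)]

# Bell state |phi+> in the basis |00>, |01>, |10>, |11>.
s = 2 ** -0.5
bell = [s, 0.0, 0.0, s]
rho_bell = [[a * b for b in bell] for a in bell]   # real amplitudes, no conjugation needed
reduced_bell = partial_trace_b(rho_bell, 2, 2)     # close to I/2, a mixed state

# Product state: the reduced states are the factors themselves.
rho = [[0.75, 0.0], [0.0, 0.25]]
sigma = [[0.5, 0.5], [0.5, 0.5]]
product = kron(rho, sigma)
reduced_a = partial_trace_b(product, 2, 2)
reduced_b = partial_trace_a(product, 2, 2)
```

Up to rounding, `reduced_bell` is the completely mixed state $I/2$, while `reduced_a` and `reduced_b` return $\rho$ and $\sigma$ respectively.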



1.6 Measurements on several copies of a state 

Given a product state of the form
$\rho^{\otimes n} = \rho^{(1)} \otimes \cdots \otimes \rho^{(n)}$, where
$\rho^{(j)}$ denotes the $j$th copy of $\rho$, there are several types of
measurements that can be performed. In this section the most common types of
measurements will be defined: collective measurements, separable measurements,
LOCC, adaptive measurements and separate measurements. This section is similar
to (Ballester, 2005, Section 1.2.6).



1.6.1 Collective measurements

This is the most general type of measurement. If $\dim(\rho) = d$ then
$\rho^{\otimes n}$ acts on $\mathbb{C}^{d^n}$. Collective measurements are POVMs
whose elements are $d^n \times d^n$ matrices satisfying (1.17) (Massar, 2000;
Gill, 2008).



1.6.2 Separable measurements 

These form a smaller class of measurements. Separable measurements are
POVMs whose elements can be expressed as

\[ M_m = \sum_i M_m^{(1,i)} \otimes \cdots \otimes M_m^{(n,i)}. \tag{1.29} \]

The elements $M_m$ must also satisfy (1.17). These measurements do not have
a clear physical meaning.



1.6.3 LOCC 

Another class of measurements are Local Operations and Classical
Communication (LOCC) (Nielsen and Chuang, 2000, p. 573). These form a smaller
class of measurements than separable measurements as there exist separable
measurements that are not LOCC (Bennett et al., 1999). Unlike separable
measurements LOCC has a clear physical meaning. Consider the situation
in which there are $n$ experimenters, each with a single copy of $\rho$. Each
experimenter can only measure his own copy of $\rho$ but is allowed to
communicate with the other experimenters. LOCC measurements are easier to
perform than some collective measurements, though the latter may often lead to
a far more accurate estimate.



1.6.4 Adaptive measurements 

In the field of quantum statistical inference one often comes across the term
adaptive measurements. Measurements of this type were introduced by
Nagaoka (1988, 1989) because the 'optimal' measurements on a single quantum
state often depend on the unknown state itself. (Nagaoka (1989) is included
in Hayashi (2005), pp. 125-132. See also Barndorff-Nielsen and Gill (2000)
for a discussion of the adaptive measurement strategy.) This dilemma of
the optimal estimation strategy depending on the unknown parameter was
described by Cochran (1973) as 'You tell me the value of the parameter
$\theta$ and I promise to design the best experiment for estimating $\theta$.'

In an adaptive measurement procedure, $n'$ measurements (with $n'$ small)
are performed on the first $n'$ copies of $\rho$ to get a rough estimate
$\tilde\rho$ of the state. Next the POVM which is optimal for $\tilde\rho$ is
used on $\rho^{\otimes (n - n')}$. An adaptive measurement may be collective but
not separable, or separable but not LOCC, or simply LOCC, depending on what the
optimal measurement is for $\tilde\rho$.

An example is now given of an adaptive measurement. (The following
procedure is LOCC.) Consider the state

\[ \rho_\phi = \begin{pmatrix} \cos^2(\theta/2) & \sin(\theta/2)\cos(\theta/2)e^{-i\phi} \\ \sin(\theta/2)\cos(\theta/2)e^{i\phi} & \sin^2(\theta/2) \end{pmatrix}, \]

where $\theta$ is known. The case $\theta = \pi/2$ is special, as there exists
an optimal POVM that does not depend on $\phi$. When $\theta \neq \pi/2$, the
optimal POVM depends on $\phi$, and one such POVM is

\[ M = (M_0, I - M_0), \qquad M_0 = \frac{1}{2} \begin{pmatrix} 1 & -ie^{-i\phi} \\ ie^{i\phi} & 1 \end{pmatrix}. \]

If $\rho_\phi$ is measured in x (see after (1.23)), outcome 0 is observed with
probability $p(0; \phi) = (1 + \sin\theta\cos\phi)/2$, and outcome 1 with
probability $p(1; \phi) = (1 - \sin\theta\cos\phi)/2$. Put $N = n'/2$ and let
$N_{x=0}$ be the number of times that outcome 0 is observed when $\rho_\phi$ is
measured $N$ times in x. This gives an estimate $N_{x=0}/N$ of $p(0; \phi)$,
and since $\theta$ is known, an estimate of $\cos\phi$.

If $\rho_\phi$ is measured in y, outcome 0 is observed with probability
$p(0; \phi) = (1 + \sin\theta\sin\phi)/2$, and outcome 1 with probability
$p(1; \phi) = (1 - \sin\theta\sin\phi)/2$. Put $N = n'/2$ and let $N_{y=0}$ be
the number of times that outcome 0 is observed when $\rho_\phi$ is measured $N$
times in y. This gives an estimate $N_{y=0}/N$ of $p(0; \phi)$, and since
$\theta$ is known, an estimate of $\sin\phi$.

Using estimates of $\cos\phi$ and $\sin\phi$, an estimate $\tilde\phi$ of
$\phi$ is obtained. The 'optimal' POVM,

\[ M = (M_0, I - M_0), \qquad M_0 = \frac{1}{2} \begin{pmatrix} 1 & -ie^{-i\tilde\phi} \\ ie^{i\tilde\phi} & 1 \end{pmatrix}, \]

is used on the remaining $n - n'$ copies of $\rho_\phi$. Using this measurement,
outcome 0 is observed with probability
$p(0; \phi) = (1 + \sin\theta\sin(\phi - \tilde\phi))/2$, and outcome 1 with
probability $p(1; \phi) = (1 - \sin\theta\sin(\phi - \tilde\phi))/2$. Provided
that $\phi - \tilde\phi \in [-\pi/2, \pi/2]$, the estimate $\hat{p}(0; \phi)$ of
$p(0; \phi)$ can be used to get a more accurate estimate $\hat\phi$ of $\phi$,
namely

\[ \hat\phi = \tilde\phi + \arcsin\left( \frac{2\hat{p}(0; \phi) - 1}{\sin\theta} \right). \]

It has been shown by Fujiwara (2006) that, for an adaptive quantum
estimation scheme, the sequence of maximum likelihood estimators is strongly
consistent and asymptotically efficient.
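
The two-stage adaptive procedure described above can be simulated classically, since each measurement outcome is a Bernoulli draw with the stated probability. The following is a minimal sketch under several assumptions that are not in the thesis: the function name `adaptive_estimate`, the use of `atan2` to combine the estimates of $\cos\phi$ and $\sin\phi$ into $\tilde\phi$, the clipping of the arcsine argument to $[-1, 1]$, and the requirement that $n'$ is even.

```python
import math
import random

def adaptive_estimate(theta, phi, n, n_rough, rng):
    """Simulate the two-stage estimate of phi; theta is assumed known.

    n is the total number of copies and n_rough (even) plays the role of n'.
    Returns the rough estimate and the refined estimate of phi.
    """
    s = math.sin(theta)
    half = n_rough // 2

    def count_zeros(p0, trials):
        # Number of times outcome 0 occurs in `trials` independent measurements.
        return sum(rng.random() < p0 for _ in range(trials))

    # Stage 1: measure in x and in y to estimate cos(phi) and sin(phi).
    p0_x = (1 + s * math.cos(phi)) / 2
    p0_y = (1 + s * math.sin(phi)) / 2
    cos_est = (2 * count_zeros(p0_x, half) / half - 1) / s
    sin_est = (2 * count_zeros(p0_y, half) / half - 1) / s
    phi_rough = math.atan2(sin_est, cos_est)        # illustrative choice

    # Stage 2: use the POVM built from phi_rough on the remaining copies.
    rest = n - 2 * half
    p0 = (1 + s * math.sin(phi - phi_rough)) / 2
    freq = count_zeros(p0, rest) / rest
    arg = max(-1.0, min(1.0, (2 * freq - 1) / s))   # guard against sampling noise
    return phi_rough, phi_rough + math.asin(arg)

rng = random.Random(0)                               # fixed seed for reproducibility
rough, refined = adaptive_estimate(math.pi / 3, 1.0, 20000, 2000, rng)
```

With most of the copies reserved for the second stage, the refined estimate is typically closer to the true value of $\phi$ than the rough one, although any single simulated run may deviate from this.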



1.6.5 Separate measurements 

These form the smallest class of measurements. A separate measurement 
is LOCC with no communication. That is, there are n experimenters, each 
with a copy of p, and no communication is allowed between them. 



1.7 Quantum Channels 



A quantum channel is a trace-preserving completely-positive map (TP-CP
map) sending density matrices to density matrices. It can be thought of as
the quantum analogue of a stochastic mapping. A mapping $\mathcal{F}$ is
positive if for all $A \geq 0$, $\mathcal{F}(A) \geq 0$. A mapping $\mathcal{F}$
is completely positive if for all positive integers $k$ and $B \geq 0$,
$(I_k \otimes \mathcal{F})(B)$ is positive (Nielsen and Chuang, 2000, p. 367).

The mathematical formalism for a quantum channel is originally due to
Choi (1975). He showed that a linear map $\Phi$ from $\mathcal{M}_n$ to
$\mathcal{M}_m$ ($\mathcal{M}_m$ is the set of $m \times m$ complex matrices) is
completely positive if and only if it can be written in the form
$\Phi(A) = \sum_k E_k A E_k^\dagger$, where $E_k$ are $m \times n$ matrices. For
the map $\Phi$ to be a quantum channel, it is further required that the mapping
is trace-preserving. The map $\Phi$ is trace-preserving if and only if
$\sum_k E_k^\dagger E_k = I_n$. Such a set of matrices $E = \{E_k\}$ are known
as a set of Kraus operators. Thus, any quantum channel can be represented using
Kraus operators $E_k$ as (Kraus, 1983; Nielsen and Chuang, 2000; Bengtsson and
Zyczkowski, 2006)

\[ \rho_0 \mapsto \sum_k E_k \rho_0 E_k^\dagger, \tag{1.30} \]

where

\[ \sum_k E_k^\dagger E_k = I. \tag{1.31} \]



The form (1.30) for a general quantum channel can be derived as follows.
Consider the composite state formed by the input state
$\rho_0 \in S(\mathcal{H})$ and the environment
$\rho_{env} \in S(\mathcal{H}_{env})$. Put

\[ \rho = \rho_0 \otimes \rho_{env}. \]

Suppose that $\rho$ undergoes unitary evolution, i.e.

\[ \rho \mapsto U \rho U^\dagger, \]

where $U^\dagger U = U U^\dagger = I$. It is assumed that $\rho_{env}$ is a pure
state, with $\rho_{env} = |0\rangle\langle 0|$, where
$|0\rangle, |1\rangle, \dots, |d - 1\rangle$ form a basis of
$\mathcal{H}_{env}$. (This is the only place in this thesis where $|0\rangle$
does not refer to $(1, 0)^T$.) It is found that $\rho_0$ has undergone the
following transformation

\[ \rho_0 \mapsto \mathrm{tr}_{env}\{U(\rho_0 \otimes |0\rangle\langle 0|)U^\dagger\} = \sum_k \langle k|U|0\rangle \, \rho_0 \, \langle 0|U^\dagger|k\rangle = \sum_k E_k \rho_0 E_k^\dagger, \qquad E_k = \langle k|U|0\rangle. \]

Some examples will now be given of parametric families of quantum
channels. A unitary channel is a mapping which transforms a state
$\rho \in S(\mathbb{C}^d)$ to the state $U \rho U^\dagger \in S(\mathbb{C}^d)$,
where $U$ is a $d \times d$ complex unitary matrix. Chapter 5 considers the
problem of estimating the parameter $\theta$ in a unitary channel acting on
$\mathcal{H} = \mathbb{C}^2$, with unitary matrix

\[ U_\theta = \begin{pmatrix} 1 & 0 \\ 0 & e^{i2\pi\theta} \end{pmatrix}. \tag{1.32} \]

If this channel acts on the state $\rho$, given in (1.10), it produces the
output state

\[ U_\theta \rho U_\theta^\dagger = \frac{1}{2} \begin{pmatrix} 1 & e^{-i2\pi\theta} \\ e^{i2\pi\theta} & 1 \end{pmatrix}. \]



Examples of non-unitary channels (channels with at least two non-zero Kraus
operators $E_k$) are now considered.

The family of depolarizing channels
$\mathcal{E} : S(\mathbb{C}^d) \to S(\mathbb{C}^d)$ is the set of mappings
(Nielsen and Chuang, 2000, p. 378)

\[ \rho_0 \mapsto (1 - \epsilon)\rho_0 + \frac{\epsilon}{d} I_d, \qquad 0 \leq \epsilon \leq 1. \tag{1.33} \]

A depolarizing channel describes the process in which with probability
$1 - \epsilon$ the state is left unchanged, and with probability $\epsilon$ is
replaced by the completely mixed state $I_d/d$.

The family of 2-dimensional depolarizing channels
$\mathcal{E} : S(\mathbb{C}^2) \to S(\mathbb{C}^2)$ have Kraus operators
(Nielsen and Chuang, 2000, p. 397)

\[ E_0 = \sqrt{1 - \frac{3\epsilon}{4}} \, I_2, \qquad E_1 = \sqrt{\frac{\epsilon}{4}} \, \sigma_x, \qquad E_2 = \sqrt{\frac{\epsilon}{4}} \, \sigma_y, \qquad E_3 = \sqrt{\frac{\epsilon}{4}} \, \sigma_z. \]
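
That these four Kraus operators satisfy the trace-preservation condition (1.31) and reproduce the mapping (1.33) with $d = 2$ can be checked numerically. The sketch below is illustrative only; the helper names and the test state are assumptions.

```python
# Minimal sketch: Kraus form (1.30) of the 2-dimensional depolarizing channel.

def matmul(a, b):
    """Matrix product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian transpose of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
ZERO = [[0j, 0j], [0j, 0j]]

def depolarizing_kraus(eps):
    """Kraus operators E_0, ..., E_3 of the depolarizing channel on C^2."""
    a = (1 - 3 * eps / 4) ** 0.5
    b = (eps / 4) ** 0.5
    return [scale(a, I2), scale(b, SX), scale(b, SY), scale(b, SZ)]

def apply_channel(kraus, rho):
    """rho -> sum_k E_k rho E_k^dagger, as in (1.30)."""
    out = ZERO
    for e in kraus:
        out = add(out, matmul(matmul(e, rho), dagger(e)))
    return out

def kraus_sum(kraus):
    """sum_k E_k^dagger E_k, which should be the identity by (1.31)."""
    out = ZERO
    for e in kraus:
        out = add(out, matmul(dagger(e), e))
    return out

eps = 0.3
rho = [[0.75, 0.1 - 0.2j], [0.1 + 0.2j, 0.25]]   # an arbitrary test state
output = apply_channel(depolarizing_kraus(eps), rho)
expected = add(scale(1 - eps, rho), scale(eps / 2, I2))   # right-hand side of (1.33)
```

The agreement of `output` and `expected` rests on the identity $\rho + \sigma_x\rho\sigma_x + \sigma_y\rho\sigma_y + \sigma_z\rho\sigma_z = 2I$ for any qubit state $\rho$.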
This family of channels forms a subset of the set of Pauli channels. The
family of Pauli channels $\mathcal{E} : S(\mathbb{C}^2) \to S(\mathbb{C}^2)$,
indexed by the parameters $(p_0, p_1, p_2, p_3)$, with $p_j \geq 0$ and
$\sum_{j=0}^{3} p_j = 1$, is the set of channels (Fujiwara and Imai, 2003)

\[ \rho_0 \mapsto \sum_{i=0}^{3} p_i \sigma_i \rho_0 \sigma_i, \tag{1.34} \]

where $\sigma_0 = I_2$ and $\sigma_1$, $\sigma_2$ and $\sigma_3$ are the Pauli
matrices (see (1.15)).

The family of generalized Pauli channels
$\mathcal{E} : S(\mathbb{C}^d) \to S(\mathbb{C}^d)$, indexed by the parameters
$(p_0, p_1, p_2, \dots, p_{d^2-1})$, with $p_j \geq 0$ and
$\sum_{j=0}^{d^2-1} p_j = 1$, is the set of channels (Fujiwara and Imai, 2003)

\[ \rho_0 \mapsto \sum_{k=0}^{d^2-1} p_k U_k \rho_0 U_k^\dagger. \tag{1.35} \]

The choice of unitary matrices $U_j$ is arbitrary.

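The family of amplitude damping channels
$\mathcal{E} : S(\mathbb{C}^2) \to S(\mathbb{C}^2)$, indexed by the parameter
$\gamma$, is the set of channels with Kraus operators (Nielsen and Chuang,
2000, p. 380)

\[ E_0 = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1 - \gamma} \end{pmatrix}, \qquad E_1 = \begin{pmatrix} 0 & \sqrt{\gamma} \\ 0 & 0 \end{pmatrix}. \tag{1.36} \]

This channel describes energy dissipation: every state is brought closer to
the pure state $|0\rangle\langle 0|$.

The family of generalized damping channels
$\mathcal{E} : S(\mathbb{C}^2) \to S(\mathbb{C}^2)$ (Nielsen and Chuang, 2000,
p. 382), indexed by the parameters $\gamma, p$, is the set of channels with
Kraus operators

\[ E_0 = \sqrt{p} \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1 - \gamma} \end{pmatrix}, \qquad E_1 = \sqrt{p} \begin{pmatrix} 0 & \sqrt{\gamma} \\ 0 & 0 \end{pmatrix}, \]

\[ E_2 = \sqrt{1 - p} \begin{pmatrix} \sqrt{1 - \gamma} & 0 \\ 0 & 1 \end{pmatrix}, \qquad E_3 = \sqrt{1 - p} \begin{pmatrix} 0 & 0 \\ \sqrt{\gamma} & 0 \end{pmatrix}. \]

The parameter $p \in [0, 1]$ represents the temperature of the environment.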



1.8 Fisher information 
1.8.1 One-parameter case 

Given a univariate family of probability distributions with probability density
functions $p(x; \theta)$, the Fisher information, introduced by Fisher (1922),
is defined as

\[ F_\theta = \int p(x; \theta) \left( \frac{\partial \ln p(x; \theta)}{\partial\theta} \right)^2 dx \tag{1.38} \]

\[ \phantom{F_\theta} = \int \frac{1}{p(x; \theta)} \left( \frac{\partial p(x; \theta)}{\partial\theta} \right)^2 dx. \tag{1.39} \]

Intuitively, Fisher information gives a measure of the amount of 'information'
about $\theta$ contained in an observation. If the random variable $X$ is
discrete with probabilities $p(1; \theta), \dots, p(n; \theta)$, then the Fisher
information can be expressed as

\[ F_\theta = \sum_m \frac{1}{p(m; \theta)} \left( \frac{dp(m; \theta)}{d\theta} \right)^2. \]
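
As an illustration of the discrete formula, the sketch below computes the Fisher information about $\phi$ obtained by measuring the state $\rho_\phi$ of Section 1.6.4 in x, for which $p(0; \phi) = (1 + \sin\theta\cos\phi)/2$. The derivative is approximated by a central finite difference, which is an illustrative choice, and the closed form $\sin^2\theta\sin^2\phi/(1 - \sin^2\theta\cos^2\phi)$ used for comparison follows by direct substitution into the formula. All function names are hypothetical.

```python
import math

def fisher_discrete(probs, t, h=1e-6):
    """F = sum_m (dp(m;t)/dt)^2 / p(m;t) for a discrete family t -> probs(t).

    The derivative is approximated by a central difference with step h.
    """
    p = probs(t)
    dp = [(a - b) / (2 * h) for a, b in zip(probs(t + h), probs(t - h))]
    return sum(d * d / q for d, q in zip(dp, p) if q > 0)

def probs_x(phi, theta=math.pi / 3):
    """Outcome probabilities when rho_phi is measured in x (theta known)."""
    s = math.sin(theta)
    return [(1 + s * math.cos(phi)) / 2, (1 - s * math.cos(phi)) / 2]

def fisher_x_exact(phi, theta=math.pi / 3):
    """Closed form obtained by substituting probs_x into the discrete formula."""
    s2 = math.sin(theta) ** 2
    return s2 * math.sin(phi) ** 2 / (1 - s2 * math.cos(phi) ** 2)

phi = 0.7
numeric = fisher_discrete(probs_x, phi)
exact = fisher_x_exact(phi)
```

The value depends on $\phi$, which is the dependence of the 'optimal' measurement on the unknown parameter that motivates the adaptive schemes of Section 1.6.4.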

Proposition 1.1 The Fisher information from $n$ i.i.d. observations
$X_1, X_2, \dots, X_n$ is equal to $nF_\theta$ where $F_\theta$ is the Fisher
information from a single observation $X_j$.

Proof. The Fisher information for a single observation $X_j$ can be written as

\[ F_\theta = -E\left[ \frac{d^2 l(\theta; x)}{d\theta^2} \right], \tag{1.40} \]

where $l(\theta; x) = \log L(\theta; x)$ is the log-likelihood (the natural
logarithm of the likelihood function). For the case of $n$ observations,

\[ L(\theta; x_1, \dots, x_n) = \prod_i p(x_i; \theta). \tag{1.41} \]

Thus

\[ l(\theta; x_1, \dots, x_n) = \sum_i \log p(x_i; \theta), \tag{1.42} \]

and the Fisher information from $n$ observations is equal to

\[ F^{(n)} = -E\left[ \sum_i \frac{d^2 l(\theta; x_i)}{d\theta^2} \right] \tag{1.43} \]

\[ \phantom{F^{(n)}} = nF_\theta. \tag{1.44} \]


The importance of Fisher information is seen in the Cramér-Rao inequality.
This states that the mean square error of an unbiased estimator $\hat\theta$ is
greater than or equal to the reciprocal of the Fisher information, i.e.

\[ E[(\hat\theta - \theta)^2] \geq \frac{1}{F_\theta}. \tag{1.45} \]

The right hand side of (1.45) is known as the Cramér-Rao bound. Under
mild regularity conditions for $p(x; \theta)$, using a maximum likelihood
estimator, as the number of observations $n \to \infty$ (Van der Vaart, 1998,
p. 63)

\[ \sqrt{n}(\hat\theta - \theta) \rightsquigarrow N(0, F_\theta^{-1}), \tag{1.46} \]

and so, assuming the estimator is unbiased,

\[ nE[(\hat\theta - \theta)^2] \to F_\theta^{-1}. \tag{1.47} \]

(The symbol $\rightsquigarrow$ denotes convergence in distribution.) The larger
the Fisher information, the more accurately the unknown parameter can be
estimated. A standard approach to estimation of a parameter, in a known family
of distributions, is to use the maximum likelihood estimator. Consequently the
result (1.47) is of great importance: it enables the asymptotic behaviour of
an estimate to be quantified.



1.8.2 Multi-parameter case 

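Given a $p$-parameter family of probability distributions with probability
density functions $p(x; \theta^1, \dots, \theta^p)$, the Fisher information is
the $p \times p$ matrix $F_\theta$ with entries

\[ (F_\theta)_{jk} = \int p(x; \theta) \frac{\partial \ln p(x; \theta)}{\partial\theta^j} \frac{\partial \ln p(x; \theta)}{\partial\theta^k} \, dx = \int \frac{1}{p(x; \theta)} \frac{\partial p(x; \theta)}{\partial\theta^j} \frac{\partial p(x; \theta)}{\partial\theta^k} \, dx. \]

The Cramér-Rao inequality becomes a matrix inequality. This states that
the mean square error of an unbiased estimator $\hat\theta$ is greater than or
equal to the inverse of the Fisher information, i.e.

\[ E[(\hat\theta - \theta)(\hat\theta - \theta)^T] \geq F_\theta^{-1}. \tag{1.48} \]

This means that the matrix
$E[(\hat\theta - \theta)(\hat\theta - \theta)^T] - F_\theta^{-1}$ is positive
semi-definite, i.e. for all $v \in \mathbb{R}^p$,

\[ v^T \{ E[(\hat\theta - \theta)(\hat\theta - \theta)^T] - F_\theta^{-1} \} v \geq 0. \]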



1.9 Quantum information 

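Definition 1.3 A Riemannian metric on a manifold $\mathcal{M}$ is a mathematical
object that assigns smoothly to each point $x$ of $\mathcal{M}$, and each
coordinate system $\theta = (\theta^1, \dots, \theta^p)$ round $x$, a positive
semi-definite $p \times p$ matrix $g_\theta(x)$ such that, for another
coordinate system $\phi = (\phi^1, \dots, \phi^p)$,

\[ g_\phi(x) = \left( \frac{\partial\theta}{\partial\phi} \right)^T g_\theta(x) \left( \frac{\partial\theta}{\partial\phi} \right), \tag{1.49} \]

where $\partial\theta/\partial\phi$ denotes the matrix with $(k, i)$ entry
$\partial\theta^k/\partial\phi^i$, or, in terms of elements of $g_\phi(x)$,

\[ g_\phi(x)_{ij} = \sum_{k,l} g_\theta(x)_{kl} \frac{\partial\theta^k}{\partial\phi^i} \frac{\partial\theta^l}{\partial\phi^j}. \tag{1.50} \]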



It has been shown by Morozova and Cencov (1990) that, up to a constant 



factor, the Fisher information is the unique monotone Riemannian metric on 
6. Several types of quantum information have been suggested as quantum 
versions of Fisher information (Petz and Sudar 1996), defined from a para- 
metric family of states pg. As the Fisher information is a Riemannian metric, 
any quantum analogue of Fisher information should also be a Riemannian 
metric. The following properties are important when considering Rieman- 
nian metrics. 
Invar iance. 

Two parametric families of states pg and o~g are said to be equivalent (pg ~ ag) 



(Petz and Sudar, 1996) if there exist two fixed TP-CP maps £, T such that 



og 



Hpe)- 



The Riemannian metric J is said to be invariant ( |Petz and Sudar 1996) if 



p e ~ a e implies J(p e ) = J{a e ). 

Monotonicity. 

The Riemannian metric J is said to be monotone (Petz and Sudar, 1996) if 



Ape) > J(S(pe)) 

for all TP-CP maps S. 

A well-defined Riemannian metric must be invariant and it is desirable that it is monotone. (If a metric is monotone then it is also invariant.) It has been shown by Petz and Sudar (1996) that there is no unique monotone quantum information quantity. The most frequently encountered monotone metrics in recent literature are the Symmetric Logarithmic Derivative (SLD), Right Logarithmic Derivative (RLD) and Kubo-Mori-Bogoliubov (KMB) metrics, which are defined in (1.51)-(1.54) and (1.59)-(1.62).

Given a one-parameter family of states $\rho_\theta$, these quantum information quantities can be expressed as

\[ H^{x} = \mathrm{tr}\{\lambda_x^{\dagger} \rho \lambda_x\}, \tag{1.51} \]

where the parameter $\theta$ has been suppressed, and the quantum scores $\lambda_x$ (the quantum analogues of the logarithmic derivative $dl/d\theta$) are defined as the solutions to the following matrix equations

\[ \frac{d\rho}{d\theta} = \frac{1}{2}\left(\rho \lambda_{SLD} + \lambda_{SLD} \rho\right), \tag{1.52} \]

\[ \frac{d\rho}{d\theta} = \rho \lambda_{RLD}, \tag{1.53} \]

\[ \frac{d \log \rho}{d\theta} = \lambda_{KMB}. \tag{1.54} \]

These all satisfy

\[ \mathrm{tr}\{\rho \lambda_x\} = 0. \tag{1.55} \]



The quantum scores $\lambda_{RLD}$ and $\lambda_{KMB}$ are defined only when $\rho_\theta$ has full rank, i.e. when $\rho_\theta$ is invertible. When $\rho_\theta$ does not have full rank, $\lambda_{SLD}$ is not defined uniquely, though $H^{SLD}$ does not depend on the choice of $\lambda_{SLD}$.

The SLD quantum information is the most commonly used quantum information quantity. It is the minimum among the set of monotone quantum information quantities (Petz and Sudar, 1996). In this thesis the SLD quantum information will be denoted simply by $H$ or $H_\theta$, and the SLD quantum score by $\lambda$ or $\lambda_\theta$.

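The SLD quantum information for $n$ copies of the state $\rho_\theta$ (i.e. $\rho_\theta^{(n)} = \rho_\theta \otimes \rho_\theta \otimes \cdots \otimes \rho_\theta$) is $n$ times the SLD quantum information for the state $\rho_\theta$, that is

\[ H(\rho_\theta^{(n)}) = n H(\rho_\theta). \tag{1.56} \]

To see this, note that in this case

\[ \frac{d\rho^{(n)}}{d\theta} = \frac{d\rho}{d\theta} \otimes \rho \otimes \cdots \otimes \rho + \rho \otimes \frac{d\rho}{d\theta} \otimes \rho \otimes \cdots \otimes \rho + \cdots + \rho \otimes \cdots \otimes \rho \otimes \frac{d\rho}{d\theta}. \tag{1.57} \]

Let $\lambda$ be a possible solution for the SLD score for $\rho$. Then a possible solution for the SLD score for $\rho^{(n)}$ is

\[ \lambda^{(n)} = \lambda \otimes I \otimes \cdots \otimes I + I \otimes \lambda \otimes I \otimes \cdots \otimes I + \cdots + I \otimes \cdots \otimes I \otimes \lambda. \tag{1.58} \]

Since $\mathrm{tr}\{\rho\lambda^2\} = \mathrm{tr}\{(d\rho/d\theta)\lambda\}$ by (1.52), the SLD quantum information can be written as

\[ H(\rho^{(n)}) = \mathrm{tr}\left\{\frac{d\rho^{(n)}}{d\theta} \lambda^{(n)}\right\} = \mathrm{tr}\left\{\frac{d\rho}{d\theta}\lambda \otimes \rho \otimes \cdots \otimes \rho\right\} + \mathrm{tr}\left\{\frac{d\rho}{d\theta} \otimes \rho\lambda \otimes \rho \otimes \cdots \otimes \rho\right\} + \cdots + \mathrm{tr}\left\{\rho \otimes \cdots \otimes \rho \otimes \frac{d\rho}{d\theta}\lambda\right\}. \]

Since $\mathrm{tr}\{\rho\lambda\} = 0$, the only non-zero terms are those of the form $\mathrm{tr}\{\rho \otimes \cdots \otimes \rho \otimes (d\rho/d\theta)\lambda \otimes \rho \otimes \cdots \otimes \rho\}$. As there are $n$ terms of this form, each equal to $H(\rho)$, (1.56) holds.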
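The additivity property (1.56) can be checked numerically in the simplest setting. The sketch below is only an illustration under stated assumptions: it takes a diagonal qubit family $\rho_\theta = \mathrm{diag}(\theta, 1-\theta)$ (a hypothetical example, not from the text), for which $\lambda = \mathrm{diag}(p_k'/p_k)$ solves (1.52) and the SLD quantum information reduces to $\sum_k p_k'^2/p_k$ over the eigenvalues.

```python
import itertools
import math

def sld_info_diagonal(eigs, d_eigs):
    """SLD quantum information of a family with fixed eigenvectors:
    the sum of p'^2 / p over the non-zero eigenvalues p."""
    return sum(dp * dp / p for p, dp in zip(eigs, d_eigs) if p > 0)

def n_copy_spectrum(eigs, d_eigs, n):
    """Eigenvalues of the n-fold tensor power and their theta-derivatives.

    Each eigenvalue is a product of single-copy eigenvalues; its derivative
    follows from the product rule, mirroring (1.57).
    """
    out_p, out_dp = [], []
    for idx in itertools.product(range(len(eigs)), repeat=n):
        p = math.prod(eigs[i] for i in idx)
        dp = sum(
            d_eigs[idx[s]] * math.prod(eigs[i] for t, i in enumerate(idx) if t != s)
            for s in range(n)
        )
        out_p.append(p)
        out_dp.append(dp)
    return out_p, out_dp

theta = 0.3
eigs, d_eigs = [theta, 1 - theta], [1.0, -1.0]  # rho = diag(theta, 1 - theta)
h1 = sld_info_diagonal(eigs, d_eigs)
h4 = sld_info_diagonal(*n_copy_spectrum(eigs, d_eigs, 4))
```

Here `h4` agrees with `4 * h1` up to rounding, as (1.56) predicts. The check covers only commuting families; the general argument is the one given above.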



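1.9.1 Multi-parameter models

Given a $p$-parameter family of quantum states $\rho(\theta^1, \dots, \theta^p)$, the quantum analogues of the logarithmic derivative $\partial l/\partial\theta^j$ are defined as the solutions to the following matrix equations

\[ \frac{\partial\rho}{\partial\theta^j} = \frac{1}{2}\left(\rho \lambda^j_{SLD} + \lambda^j_{SLD} \rho\right), \tag{1.59} \]

\[ \frac{\partial\rho}{\partial\theta^j} = \rho \lambda^j_{RLD}, \tag{1.60} \]

\[ \frac{\partial \log \rho}{\partial\theta^j} = \lambda^j_{KMB}. \tag{1.61} \]

These all satisfy

\[ \mathrm{tr}\{\rho \lambda^j_x\} = 0. \]

The quantum informations are $p \times p$ matrices with entries

\[ (H^{x})_{jk} = \Re\, \mathrm{tr}\{\lambda_x^{j\dagger} \rho \lambda_x^{k}\}. \tag{1.62} \]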

1.10 Braunstein-Caves inequality

Consider the parametric statistical model resulting from a measurement $M = \{M_m\}$ of a parametric family of states $\rho_\theta$. This has the probability function

\[ p(m; \theta) = \mathrm{tr}\{\rho_\theta M_m\}, \tag{1.63} \]

which gives Fisher information $F_\theta^M$. Braunstein and Caves (1994) proved the inequality

\[ F_\theta^M \leq H_\theta \tag{1.64} \]

for the one-parameter case. They showed that, in the one-parameter case, the SLD quantum information gives the maximum Fisher information that can be obtained from measuring a model $\rho_\theta$.



1.10.1 Multi-parameter Braunstein-Caves inequality

Theorem 1.1 Let $F_\theta^M$ be the Fisher information given by a measurement $M$ on a parameterised quantum model $\{\rho_\theta : \theta = (\theta^1, \dots, \theta^p) \in \mathbb{R}^p\}$, with SLD quantum information $H_\theta$. Then

\[ F_\theta^M \leq H_\theta. \tag{1.65} \]

This means that the matrix $H_\theta - F_\theta^M$ is non-negative, which is equivalent to

\[ \sum_{j,k} x_j x_k (F_\theta^M)_{jk} \leq \sum_{j,k} x_j x_k (H_\theta)_{jk}, \tag{1.66} \]

for all vectors $x = (x_1, \dots, x_p) \in \mathbb{R}^p$.



Not only does (1.65) give an upper bound on the Fisher information, but the 
proof gives necessary and sufficient conditions for equality. The following 
proof is similar to that in (Ballester 2005 p. 26). 
Proof. 

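Denote by $\Omega_+$ the set of outcomes which occur with non-zero probability, suppress $\theta$, and write $\lambda = \sum_j x_j \lambda^j$. Then

\begin{align}
\sum_{j,k} x_j x_k (F_\theta^M)_{jk}
&= \sum_{j,k} x_j x_k \sum_{m \in \Omega_+} \frac{1}{p(m;\theta)} \frac{\partial p(m;\theta)}{\partial\theta^j} \frac{\partial p(m;\theta)}{\partial\theta^k} \\
&= \sum_{j,k} x_j x_k \sum_{m \in \Omega_+} \frac{1}{\mathrm{tr}\{\rho M_m\}} \left(\Re\, \mathrm{tr}\{\lambda^j \rho M_m\}\right) \left(\Re\, \mathrm{tr}\{\lambda^k \rho M_m\}\right) \\
&= \sum_{m \in \Omega_+} \frac{1}{\mathrm{tr}\{\rho M_m\}} \left(\Re\, \mathrm{tr}\{\lambda \rho M_m\}\right)^2 \\
&\leq \sum_{m \in \Omega_+} \frac{1}{\mathrm{tr}\{\rho M_m\}} \left|\mathrm{tr}\{\lambda \rho M_m\}\right|^2 \\
&= \sum_{m \in \Omega_+} \frac{1}{\mathrm{tr}\{\rho M_m\}} \left|\mathrm{tr}\left\{(M_m^{1/2} \lambda \rho^{1/2})(M_m^{1/2} \rho^{1/2})^{\dagger}\right\}\right|^2 \\
&\leq \sum_{m \in \Omega_+} \mathrm{tr}\{M_m \lambda \rho \lambda\} \\
&\leq \mathrm{tr}\{\lambda \rho \lambda\} \tag{1.67} \\
&= \sum_{j,k} x_j x_k \, \mathrm{tr}\{\lambda^j \rho \lambda^k\} \tag{1.68} \\
&= \sum_{j,k} x_j x_k (H_\theta)_{jk}.
\end{align}

The second of the three inequalities is the Cauchy-Schwarz inequality for the inner product $\mathrm{tr}\{A B^{\dagger}\}$. Inequality (1.67) follows from the fact that $\sum_{m \in \Omega_+} M_m \leq I$, since $\sum_{m \in \Omega} M_m = I$ and $M_m \geq 0$.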



Theorem 1.2 Equality holds in (1.65) if and only if

\[ M_m^{1/2} \lambda^j \rho^{1/2} = \xi_m^j M_m^{1/2} \rho^{1/2}, \qquad \xi_m^j \in \mathbb{R}, \quad \forall j, m. \tag{1.69} \]



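The following proof is similar to that in (Ballester, 2005, p. 27).
Proof.
Equality holds in (1.65) if and only if the following three conditions are met, where $\lambda$ ranges over all real linear combinations $\sum_j x_j \lambda^j$:

\[ \Im\, \mathrm{tr}\{\lambda^j \rho M_m\} = 0, \qquad \forall j, m, \tag{1.70} \]

\[ M_m^{1/2} \lambda \rho^{1/2} = z_m M_m^{1/2} \rho^{1/2}, \qquad \text{some } z_m \in \mathbb{C}, \quad \forall m, \lambda, \tag{1.71} \]

\[ \sum_{m \in \Omega_+} \mathrm{tr}\{M_m \lambda \rho \lambda\} = \mathrm{tr}\{\lambda \rho \lambda\}, \qquad \forall \lambda. \tag{1.72} \]

Since

\[ \mathrm{tr}\{\lambda \rho \lambda\} = \sum_{m \in \Omega_+} \mathrm{tr}\{M_m \lambda \rho \lambda\} + \sum_{m \in \Omega \setminus \Omega_+} \mathrm{tr}\{M_m \lambda \rho \lambda\}, \]

equality holds in (1.65) if and only if the following three conditions are met:

\[ \Im\, \mathrm{tr}\{\lambda^j \rho M_m\} = 0, \qquad \forall j, m, \tag{1.73} \]

\[ M_m^{1/2} \lambda^j \rho^{1/2} = z_m^j M_m^{1/2} \rho^{1/2}, \qquad \text{some } z_m^j \in \mathbb{C}, \quad \forall j, m, \tag{1.74} \]

\[ \mathrm{tr}\{M_m \lambda^j \rho \lambda^k\} = 0, \qquad \forall j, k, \quad m \in \Omega \setminus \Omega_+. \tag{1.75} \]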



Theorem 1.2 will be proved by showing that

(i) if (1.69) holds, then (1.73), (1.74) and (1.75) hold, and thus equality holds in (1.65) (consequently (1.69) is a sufficient condition);

(ii) the conditions (1.73) and (1.74) both hold only if (1.69) holds (thus (1.69) is a necessary condition).

(i) If (1.69) holds, then (1.74) obviously holds. Pre-multiplying (1.69) by $M_m^{1/2}$, post-multiplying by $\rho^{1/2}$ and taking the trace shows that (1.73) also holds. Note that

\[ \mathrm{tr}\{A^{\dagger} A\} = 0 \quad \text{if and only if} \quad A = 0. \tag{1.76} \]

For $m \in \Omega \setminus \Omega_+$, $p_m = 0$, and so, since $p_m = \mathrm{tr}\{(M_m^{1/2} \rho^{1/2})^{\dagger} (M_m^{1/2} \rho^{1/2})\}$, by (1.76) it follows that $M_m^{1/2} \rho^{1/2} = 0$. If (1.69) holds, then for $m \in \Omega \setminus \Omega_+$, $M_m^{1/2} \lambda^j \rho^{1/2} = 0$ and so (1.75) holds.

(ii) First, it will be assumed that (1.74) holds. Pre-multiplying (1.74) by $M_m^{1/2}$, post-multiplying by $\rho^{1/2}$ and taking the trace gives

\[ \mathrm{tr}\{M_m \lambda^j \rho\} = z_m^j p_m, \qquad \forall j, m. \tag{1.77} \]

For condition (1.73) to hold, $z_m^j$ must be real. Thus (1.73) and (1.74) both hold only if (1.69) holds.

1.10.2 Equality

For one-parameter models, equality holds in (1.64) if and only if

\[ M_m^{1/2} \lambda \rho^{1/2} = \xi_m M_m^{1/2} \rho^{1/2}, \qquad \forall m, \quad \xi_m \in \mathbb{R}. \tag{1.78} \]

As $\lambda$ is self-adjoint, it can be written as

\[ \lambda = \sum_i \xi_i |e_i\rangle\langle e_i|, \]

with real eigenvalues $\xi_i$ and orthonormal eigenvectors $|e_i\rangle$. The POVM $M = \{M_i = |e_i\rangle\langle e_i|\}$ satisfies (1.78) and so, using this POVM,



equality holds in (1.64). It has been shown by Barndorff-Nielsen and Gill (2000) that, in general, the optimal POVM will depend on the unknown parameter. To get around this, an adaptive measurement scheme can be used, as described in Section 1.6.4. There are a few families of states for which the optimal POVM does not depend on the parameter, such as the set of states corresponding to the 'equator' of the Bloch ball, given by the set of density matrices

\[ \rho_\theta = \frac{1}{2} \begin{pmatrix} 1 & e^{-i 2\pi\theta} \\ e^{i 2\pi\theta} & 1 \end{pmatrix}, \qquad \theta \in [0, 1), \]

and sets of quasi-classical states. Quasi-classical states are defined as sets of states for which the eigenvectors are known, i.e. families of states of the form

\[ \rho_\theta = \sum_i p_i(\theta) |w_i\rangle\langle w_i|. \]

In the multi-dimensional case there exist sets of states for which the bound (1.65) is not attainable even using an adaptive scheme (Barndorff-Nielsen and Gill, 2000). In Section 4.2, it is shown that for any non-degenerate set of pure states (these can be parameterised by a maximum of $2(d-1)$ parameters), equality holds in (1.65) only if the number of parameters $p \leq d - 1$.
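The statement about the equator of the Bloch ball can be illustrated numerically. The sketch below is a minimal check under stated assumptions: it takes the pure state $|\psi_\theta\rangle = (1, e^{2\pi i\theta})/\sqrt{2}$, uses the standard pure-state expression $H = 4(\langle\psi'|\psi'\rangle - |\langle\psi'|\psi\rangle|^2)$ for the SLD quantum information (a well-known formula that is not derived in this section), and measures in a fixed basis $(1, \pm e^{i\phi})/\sqrt{2}$ chosen for illustration. Derivatives are approximated by central differences.

```python
import cmath
import math

def psi(theta):
    """Pure equator state (1, e^{2 pi i theta}) / sqrt(2)."""
    return [1 / math.sqrt(2), cmath.exp(2j * math.pi * theta) / math.sqrt(2)]

def outcome_probs(theta, phi):
    """Probabilities for the projective measurement onto (1, +/- e^{i phi}) / sqrt(2)."""
    a, b = psi(theta)
    probs = []
    for sign in (1, -1):
        e = [1 / math.sqrt(2), sign * cmath.exp(1j * phi) / math.sqrt(2)]
        amp = e[0].conjugate() * a + e[1].conjugate() * b
        probs.append(abs(amp) ** 2)
    return probs

def fisher_information(theta, phi, h=1e-5):
    """Classical Fisher information, with dp/dtheta from central differences."""
    p0 = outcome_probs(theta, phi)
    up = outcome_probs(theta + h, phi)
    lo = outcome_probs(theta - h, phi)
    return sum(((u - l) / (2 * h)) ** 2 / p for p, u, l in zip(p0, up, lo) if p > 1e-12)

def sld_information(theta, h=1e-5):
    """SLD information of a pure state: 4 (<psi'|psi'> - |<psi'|psi>|^2)."""
    s, up, lo = psi(theta), psi(theta + h), psi(theta - h)
    d = [(u - l) / (2 * h) for u, l in zip(up, lo)]
    norm = sum(abs(x) ** 2 for x in d)
    overlap = sum(x.conjugate() * y for x, y in zip(d, s))
    return 4 * (norm - abs(overlap) ** 2)

H = sld_information(0.2)
F = fisher_information(0.2, 0.4)
```

For this family both quantities equal $4\pi^2$ for any fixed $\phi$ with $\sin(2\pi\theta - \phi) \neq 0$, so a measurement that does not depend on $\theta$ attains the bound (1.64).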



The fact that the SLD quantum information is not in general attainable means that it cannot in general be used to find the optimal estimation method for quantum states. The problem of optimally estimating $n$ identical quantum states has recently been solved by Guţă et al. (2007) and Kahn and Guţă (2009). The solutions presented in these papers are based on quantum local asymptotic normality: given $n$ copies of a state, as $n \to \infty$ the joint state converges to a statistical model consisting of a classical Gaussian distribution and a quantum Gaussian distribution. The optimal estimation procedure for these models is known, having been solved by Yuen and Lax (1973) and Holevo (1982). Quantum local asymptotic normality was first studied in Hayashi (2003a,b) and used for estimation in Hayashi and Matsumoto (2004). It was later made more rigorous by Guţă and Kahn (2006) and Guţă and Jencova (2007).

The optimal estimation of qubits has been solved explicitly in the Bayesian set-up, in the particular case of an invariant prior, in (Bagan et al., 2006).



1.10.3 Equality in the case of pure states

Putting together (1.48) and (1.65) gives the quantum Cramer-Rao inequality

\[ E[(\hat\theta - \theta)(\hat\theta - \theta)^{T}] \geq H_\theta^{-1}. \tag{1.79} \]



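A result of Matsumoto (1997) will now be considered. In the case of pure states, it gives a concise necessary and sufficient condition for the existence of a POVM such that equality holds in (1.79).

Theorem 1.3 Let $\{\rho_\theta : \theta \in \Theta\}$ be a parameterised family of pure states with $\rho_\theta = |\psi_\theta\rangle\langle\psi_\theta|$. Then there exists a POVM and estimator such that equality holds in (1.79) at $\theta = \theta_0$, if and only if

\[ \Im \langle l_j(\theta_0) | l_k(\theta_0) \rangle = 0, \qquad \forall j, k, \tag{1.80} \]

where $|l_j(\theta)\rangle = \lambda^j_\theta |\psi_\theta\rangle$ (Matsumoto, 1997, 2002; Fujiwara, 2002).

In Section 4.2 it is shown that condition (1.80) is equivalent to the simpler condition

\[ \Im \langle \psi^{(j)}(\theta_0) | \psi^{(k)}(\theta_0) \rangle = 0, \qquad \forall j, k, \qquad |\psi^{(j)}(\theta)\rangle = \frac{\partial}{\partial\theta^j} |\psi_\theta\rangle. \tag{1.81} \]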



When (1.80) is satisfied, a POVM giving equality in (1.79) is given explicitly by (Ballester, 2004a)

\[ M_m = |b_m\rangle\langle b_m|, \quad m = 1, \dots, p+1, \qquad M_{p+2} = I - \sum_{m=1}^{p+1} M_m, \qquad |b_m\rangle = \sum_{n=1}^{p+1} O_{mn} |v_n\rangle, \tag{1.82} \]

with $O$ a $(p+1) \times (p+1)$ real orthogonal matrix satisfying $O_{m,p+1} \neq 0$. That this POVM does indeed give equality in (1.79) can be seen from Lemma 9 of Fujiwara (2002).



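An original proof of the necessity part of Theorem 1.3 will now be given.

Lemma 1.2 Condition (1.80) is a necessary condition for equality in (1.79).

Proof. For equality in (1.79) it is necessary that equality holds in (1.65). Equality holds in (1.65) if and only if

\[ M_m^{1/2} \lambda^k \rho^{1/2} = \xi_m^k M_m^{1/2} \rho^{1/2}, \qquad \xi_m^k \in \mathbb{R}, \quad \forall k, m. \tag{1.83} \]

For pure states $\rho^{1/2} = \rho = |\psi\rangle\langle\psi|$, and (1.83) becomes

\[ M_m^{1/2} \lambda^k |\psi\rangle\langle\psi| = \xi_m^k M_m^{1/2} |\psi\rangle\langle\psi|. \]

Thus equality holds in (1.65) if and only if

\[ M_m^{1/2} |l_k\rangle = \xi_m^k M_m^{1/2} |\psi\rangle, \qquad \forall k, m. \tag{1.84} \]

Taking the adjoint of (1.84) gives

\[ \langle l_j | M_m^{1/2} = \xi_m^j \langle\psi| M_m^{1/2}, \qquad \forall j, m. \tag{1.85} \]

Pre-multiplying the left hand side of (1.84) by the left hand side of (1.85), and the right hand side of (1.84) by the right hand side of (1.85) gives the necessary condition

\[ \langle l_j | M_m | l_k \rangle = \xi_m^j \xi_m^k p_m, \qquad \forall j, k, m, \]

where $p_m = \langle\psi| M_m |\psi\rangle$. Summing over $m$, and using the result $\sum_m M_m = I$, gives

\[ \langle l_j | l_k \rangle = \sum_m \xi_m^j \xi_m^k p_m. \]

As $\xi_m^j$, $\xi_m^k$ and $p_m$ are all real, it follows that $\langle l_j | l_k \rangle$ is real and (1.80) is a necessary condition for equality in (1.65), and thus a necessary condition for equality in (1.79).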



That (1.80) is a sufficient condition for equality in (1.79) follows from Ballester's result that if (1.80) holds, then the POVM given in (1.82) gives equality in (1.65).

1.10.4 Attainable measurements in the 2-dimensional case

For quantum statistical models with $\mathcal{H} = \mathbb{C}^2$, equality holds in the Braunstein-Caves inequality (1.64) only if every element of the POVM $M = \{M_m\}$ has rank 1. This was shown for pure states by Barndorff-Nielsen and Gill (2000), and for mixed states by Luati (2004). An original proof of this result, which includes mixed and pure state models, will now be given. A necessary condition for equality in (1.64) is

\[ M_m^{1/2} \lambda \rho^{1/2} = \xi_m M_m^{1/2} \rho^{1/2}, \qquad \xi_m \in \mathbb{R}, \quad \forall m. \]

Pre-multiplying by $M_m^{1/2}$ and post-multiplying by $\rho^{1/2}$ gives

\[ M_m \lambda \rho = \xi_m M_m \rho, \qquad \forall m. \]

Then

\[ M_m A_m = 0, \qquad \forall m, \tag{1.86} \]

where

\[ A_m = (\lambda - \xi_m I)\rho. \]

Now, from (1.86), it is seen that $M_m$ is singular unless $A_m = 0$ for all $\theta$. It will be assumed that $A_m = 0$ for all $\theta$. If this is so then

\[ \lambda\rho = \xi_m \rho. \tag{1.87} \]

Taking the trace of (1.87) gives

\[ \mathrm{tr}\{\lambda\rho\} = \xi_m \mathrm{tr}\{\rho\} = \xi_m, \tag{1.88} \]

and $\mathrm{tr}\{\lambda\rho\} = 0$ by (1.55), so $\xi_m = 0$. Thus from (1.87) and (1.88),

\[ \lambda\rho = 0, \]

and so

\[ \lambda\rho + (\lambda\rho)^{\dagger} = 2\frac{d\rho}{d\theta} = 0. \]

Thus, if $A_m = 0$, the model does not depend on $\theta$. Assuming that the model does depend on $\theta$, it follows that $A_m \neq 0$ and so $M_m$ is singular. A non-zero singular $2 \times 2$ matrix has rank 1, so a consequence of this is that in the 2-dimensional case, the elements of attainable measurements have rank 1.

1.11 Estimation

Many quantum information processes can be represented as quantum channels. In practice, quantum channels are not known a priori and estimating them is of great importance.

There are several ways to estimate a quantum channel. One approach is quantum process tomography, which is discussed in chapter 8 of Nielsen and Chuang (2000). For this approach it is necessary to estimate how the channel acts on different bases of the Hilbert space plus linear combinations thereof. A problem with this method is that in many practical situations it is not possible to prepare these input states in the laboratory (de Martini et al., 2003).

Another approach is to assume that the channel comes from a given parametric family of channels (Fujiwara, 2001, 2002, 2004; Fujiwara and Imai, 2003; Ballester, 2004a,b; Sarovar and Milburn, 2006). (The latter approach will be followed in this thesis.) A family of channels parametrized by a real parameter $\theta$ can be represented by Kraus operators depending on $\theta$ as

\[ \rho_0 \mapsto \rho_\theta = \sum_k E_k(\theta) \rho_0 E_k^{\dagger}(\theta). \tag{1.89} \]

When estimating a quantum channel, there are many different factors to consider: how should the channels be arranged, and what combination of input state, POVM and estimator is best. The idea of finding the optimal input state was considered by Acín et al. (2001).



In general, for a parametric family of channels, different input states lead to different families of output states. The input state is chosen such that the family of output states has the maximum attainable SLD quantum information. The measurement which gives equality in (1.65) is chosen (an adaptive measurement may be needed), and the maximum likelihood estimator used.

In this thesis the performance of an estimation procedure is usually measured either by the mean square error $E[(\hat\theta - \theta)^2]$ or, for unitary channels, where $U_{\hat\theta}$ is the estimate of the unitary matrix $U_\theta$, by

\[ 1 - \langle F(U_{\hat\theta}, U_\theta) \rangle = 1 - \frac{\langle |\mathrm{tr}\{U_{\hat\theta}^{\dagger} U_\theta\}|^2 \rangle}{d^2}, \tag{1.90} \]

where $\langle \cdot \rangle$ denotes expectation. Often this cost function will be denoted simply by $1 - \langle F \rangle$. Given a family of channels $\mathcal{E}(\theta)$, an estimate $\hat\theta$ of a parameter $\theta$ will depend on $n$, the number of times the channel $\mathcal{E}(\theta)$ is used. Similarly, an estimate $U_{\hat\theta}$ of a unitary matrix $U_\theta$ will also depend on $n$. It is of interest to see how rapidly $\hat\theta$ approaches $\theta$, and $U_{\hat\theta}$ approaches $U_\theta$, as $n \to \infty$. The 'big O' notation will be used for this purpose. It is said that '$f(n)$ is $O(g(n))$' if there exist constants $c$ and $n_0$ such that for all $n > n_0$, $f(n) \leq c g(n)$ (Nielsen and Chuang, 2000, p. 136). That is, for large $n$, up to an unimportant constant factor, the function $g(n)$ is an upper bound on $f(n)$.


1.11.1 Important developments in channel estimation 

Here a brief review is given of the major advances in the estimation of quan- 
tum channels. 

A channel $\mathcal{E} : S(\mathbb{C}^d) \to S(\mathbb{C}^d)$ can be extended to a channel $I \otimes \mathcal{E} : S(\mathbb{C}^{d^2}) \to S(\mathbb{C}^{d^2})$ by

\[ \rho_1 \mapsto (I_d \otimes \mathcal{E})(\rho_1), \qquad \rho_1 \in S(\mathbb{C}^{d^2}). \tag{1.91} \]

For many channels $\mathcal{E}$, when using (1.91), a maximally entangled input state is optimal, in terms of Fisher information. Often the Fisher information is significantly greater than can be obtained from the unextended channel $\mathcal{E} : S(\mathbb{C}^d) \to S(\mathbb{C}^d)$. This was shown for a completely unknown unitary matrix in $SU(2)$ by Fujiwara (2002), and $SU(d)$ (close to the identity) by Ballester (2004b). This has also been shown for several non-unitary channels, in particular the 2-dimensional depolarizing channel (Fujiwara, 2001) and, more generally, the generalized Pauli channels (Fujiwara and Imai, 2003).



Another advantage of the extended channel $I \otimes \mathcal{E}$ is that, using a maximally entangled input state, the output states are in one-to-one correspondence with the channel. This means that, in contrast to quantum tomography, the experimenter does not require many different input states: it is enough to have many copies of a maximally entangled state.

Using the extension (1.91) with a maximally entangled input state the mean square error and $1 - \langle F \rangle$ are $O(1/n)$ (Hayashi, 2006a). This rate at which $1 - \langle F \rangle$ approaches zero is known as the standard quantum limit (de Burgh and Bartlett, 2005), but can be surpassed (Hayashi, 2006a; Kahn, 2007; Imai and Fujiwara, 2007).

Another major step in estimation, when $n$ copies of a channel are available, was the idea of using the following extension with an entangled input state, so that

\[ \rho_2 \mapsto \mathcal{E}^{\otimes n}(\rho_2), \qquad \rho_2 \in S(\mathbb{C}^{d^n}). \tag{1.92} \]

One of the first clear uses of this method for estimation was by Huelga et al. (1997).



Using the experimental setup (1.92), it has been shown that it is possible to estimate a unitary matrix with $1 - \langle F \rangle = O(1/n^2)$. This has been shown for estimation of an unknown unitary matrix in $SU(2)$ by Hayashi (2006a) and $SU(d)$ by Kahn (2007). This rate at which $1 - \langle F \rangle$ approaches zero is known as the Heisenberg limit (Giovannetti et al., 2004) and cannot be surpassed (Kahn, 2007).

A reference frame is a specific coordinate system. Estimation of a unitary matrix in $SU(2)$ is equivalent to the problem of transmitting a 3-dimensional reference frame from Alice to Bob. Alice encodes information about her reference frame in quantum particles, and then sends these to Bob. Bob measures the quantum particles, and from his results estimates Alice's reference frame. It has been shown that it is possible to do this with $1 - \langle F \rangle = O(1/n^2)$ (Bagan et al., 2004a,b; Chiribella et al., 2004).



For most channels it is not possible to surpass the standard quantum limit. This has been shown for generalized Pauli channels by Fujiwara and Imai (2003). Recently it has been shown (Fujiwara and Imai, 2008) that for most channels, given $n$ copies and using the setup (1.92), the SLD quantum information is $O(n)$. A consequence of this is that, from the quantum Cramer-Rao inequality (1.79), for these channels, the mean square error is $O(1/n)$.

It is also possible to use a channel repeatedly on the same input state, i.e.

\[ \rho_0 \mapsto \mathcal{E}^{n}(\rho_0). \tag{1.93} \]

Kitaev (1996) suggested an $l$-stage iterative estimation scheme for the unitary matrix (1.32). At the $k$th stage $U_\theta$ acts $2^{k-1}$ times on the same input state. At each stage, several measurements are made. Using this information, an estimate $\hat\theta$ of $\theta$ is obtained satisfying $\Pr(|\hat\theta - \theta| < 1/2^{l+2}) \geq 1 - \epsilon$. The value of $\epsilon$ can be made arbitrarily small by doing more measurements at each stage.

For a similar estimation scheme, Rudolph and Grover (2003) showed that, by choosing $\epsilon = 1/2^{2l}$, $1 - \langle F \rangle = O((\log n / n)^2)$. The advantage of these estimation schemes is their simplicity: they require no entanglement and only a single copy of $\mathcal{E}$. In spite of this, $1 - \langle F \rangle$ is still close to the Heisenberg limit.

This thesis contains, as far as the author is aware, the first complete method for iterative estimation similar to that of Kitaev (1996). It is also shown that an extension similar to (1.92) can be used to estimate $n$ non-identical channels, with an entangled input state. This results in an increase in the rate at which the mean square error decreases, relative to using a separable state.

Chapter 2

Attainability of the information bound of Sarovar and Milburn

2.1 Introduction

The problem of estimating non-unitary quantum channels is more difficult than that of estimating unitary channels. The output states of non-unitary channels are mixed, and the SLD quantum information is generally more cumbersome to compute. Also, for multi-parameter families of mixed states, there is no known analogue of Matsumoto's condition (1.80) for equality in the quantum Cramer-Rao inequality (1.79); neither is there a known method for computing the optimal POVM.

Sarovar and Milburn (2006) introduced an upper bound on the Fisher information obtained from measuring the output states of a parameterised family of channels. They also gave necessary and sufficient conditions for equality. Their bound depends on the Kraus operators of the channel and not on the set of output states. In this chapter it is shown that this bound is not generally attainable, and consequently does not generally give the optimal POVM. Thus the attempt of Sarovar and Milburn to find the optimal estimation strategy for non-unitary quantum channels is not successful. (The work in this chapter has been published in O'Loan (2007).)

The problem of how to express the SLD quantum information of a noisy channel in terms of its Kraus operators has recently been solved by Fujiwara and Imai (2008) for the extended channel $I_d \otimes \mathcal{E} : S(\mathbb{C}^{d^2}) \to S(\mathbb{C}^{d^2})$. This puts an upper bound on the SLD quantum information for the unextended channel $\mathcal{E} : S(\mathbb{C}^d) \to S(\mathbb{C}^d)$, but this bound will not, in general, be attainable.

2.1.1 The approach of Sarovar and Milburn

Sarovar and Milburn looked at estimating one-parameter quantum channels of the form

\[ \rho_0 \mapsto \rho_\theta = \sum_k E_k(\theta) \rho_0 E_k^{\dagger}(\theta) \tag{2.1} \]



(see (1.30)). The input state $\rho_0$ is a known pure state, and is chosen such that the output state is in one-to-one correspondence with the channel. Since a specific value of $\theta$ corresponds to a specific channel, estimation of the channel reduces to a parameter estimation problem. Sarovar and Milburn were interested in finding the maximal Fisher information that can be obtained by measuring the output states of the set of channels (2.1). They were also interested in finding POVMs that attain this bound. First, Sarovar and Milburn derived the inequality

\[ F_\theta^M \leq C_E(\theta), \tag{2.2} \]

where $E$ denotes a set of Kraus operators $\{E_k\}$ and

\[ C_E(\theta) = 4 \sum_k \mathrm{tr}\{E_k'(\theta) \rho_0 E_k'^{\dagger}(\theta)\}, \qquad E_k'(\theta) = \frac{d}{d\theta} E_k(\theta). \tag{2.3} \]

However, it was noted that $C_E(\theta)$ depends on the Kraus representation $E$ (Sarovar and Milburn, 2006). For any channel $\mathcal{E}$, the Kraus representation is not unique. Given a unitary matrix $U = [u_{jk}]$, the set of operators $\{F_j\}$ given by

\[ F_j = \sum_k u_{jk} E_k \]

leads to the same quantum channel (Nielsen and Chuang, 2000, p. 372). That is, for all $\rho_0$,

\[ \sum_k E_k \rho_0 E_k^{\dagger} = \sum_j F_j \rho_0 F_j^{\dagger}. \]

To obtain a bound which depends only on the channel and not on the Kraus representation, Sarovar and Milburn chose the bound given by the canonical Kraus operators. Canonical Kraus operators $\{\Upsilon_k(\theta)\}$ are defined as Kraus operators satisfying

\[ \mathrm{tr}\{\Upsilon_k(\theta) \rho_0 \Upsilon_j^{\dagger}(\theta)\} = \delta_{jk} p_k(\theta), \qquad \forall j, k. \tag{2.4} \]

From (2.2) it follows that

\[ F_\theta^M \leq C_\Upsilon(\theta). \tag{2.5} \]

Remark 2.1 The canonical Kraus operators are unique only up to a choice of phase (see p. 267 of Bengtsson and Zyczkowski (2006)). In a later chapter it is shown that this leads to ambiguity in the bound $C_\Upsilon(\theta)$. However, this does not affect the results in this chapter.
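The non-uniqueness of the Kraus representation described above can be checked directly. The following Python sketch is only an illustration under stated assumptions: the dephasing channel with Kraus operators $\sqrt{1-\theta}\, I$ and $\sqrt{\theta}\, Z$, the particular unitary matrix `u` and the input state `rho0` are all hypothetical choices, and matrices are stored as nested lists to keep the example self-contained.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def apply_channel(kraus, rho):
    """rho -> sum_k E_k rho E_k^dagger for 2 x 2 matrices."""
    out = [[0j, 0j], [0j, 0j]]
    for e in kraus:
        out = add(out, matmul(matmul(e, rho), dagger(e)))
    return out

def mix(u, kraus):
    """F_j = sum_k u_jk E_k for a unitary matrix u = [u_jk]."""
    mixed = []
    for row in u:
        f = [[0j, 0j], [0j, 0j]]
        for u_jk, e in zip(row, kraus):
            f = add(f, scale(u_jk, e))
        mixed.append(f)
    return mixed

theta = 0.3
identity = [[1, 0], [0, 1]]
pauli_z = [[1, 0], [0, -1]]
kraus_e = [scale(math.sqrt(1 - theta), identity), scale(math.sqrt(theta), pauli_z)]
r = 1 / math.sqrt(2)
u = [[r, 1j * r], [1j * r, r]]          # a fixed 2 x 2 unitary matrix
kraus_f = mix(u, kraus_e)
rho0 = [[0.6, 0.2 - 0.1j], [0.2 + 0.1j, 0.4]]
out_e = apply_channel(kraus_e, rho0)
out_f = apply_channel(kraus_f, rho0)
max_diff = max(abs(out_e[i][j] - out_f[i][j]) for i in range(2) for j in range(2))
```

The two sets of Kraus operators give the same output state up to rounding. A unitary matrix that is fixed, as here, also leaves $C_E(\theta)$ in (2.3) unchanged; the dependence of $C_E(\theta)$ on the representation arises when the mixing matrix itself depends on $\theta$.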



Throughout the rest of this chapter the right hand side of (2.5) will be referred to as the SM bound. The bound (2.5) is said to be uniformly attainable if for all $\theta$ in $\Theta$, there exists a POVM $M$, possibly depending on $\theta$, such that $F_\theta^M = C_\Upsilon(\theta)$. If this bound is not uniformly attainable, then no bound of the form (2.3) is uniformly attainable (Sarovar and Milburn, 2006). To achieve equality in (2.5) the POVM $\{M_m\}$ must satisfy

\[ M_m^{1/2} \Upsilon_k'(\theta) \rho_0^{1/2} = \xi_m(\theta) M_m^{1/2} \Upsilon_k(\theta) \rho_0^{1/2}, \qquad \forall m, k, \tag{2.6} \]

for some real $\xi_m(\theta)$. (This condition is analogous to (1.69).) For channels with quasi-classical output states (see Section 1.10.2), it was shown in Sarovar and Milburn (2006) that this bound is attainable. Channels of this type will be called quasi-classical channels. Sarovar and Milburn asked

(i) whether their bound (2.5) is attainable more generally,

(ii) whether explicit expressions for optimal POVMs can be derived from the attainability conditions (2.6).

It is very important for an upper bound on Fisher information to be attainable, otherwise it gives an unrealistic view of how well a parameter can be estimated.



2.2 One-parameter channels

In this chapter the extended channel will be considered, i.e.

\[ \rho_0 \mapsto (I_d \otimes \mathcal{E})(\rho_0), \qquad \rho_0 \in S(\mathbb{C}^{d^2}). \tag{2.7} \]

The canonical Kraus operators $\{\Upsilon_k(\theta)\}$ are $d^2 \times d^2$ Kraus operators satisfying (2.4). When the input state is pure, with $\rho_0 = |\psi_0\rangle\langle\psi_0|$, condition (2.4) for the canonical Kraus decomposition is equivalent to the condition

\[ \langle v_j(\theta) | v_k(\theta) \rangle = \delta_{jk} p_k(\theta), \qquad \text{where } |v_k(\theta)\rangle = \Upsilon_k(\theta) |\psi_0\rangle. \]

The output state is

\[ \rho_\theta = \sum_k |v_k(\theta)\rangle\langle v_k(\theta)|. \tag{2.8} \]

This can be rewritten as

\[ \rho_\theta = \sum_k p_k(\theta) |w_k(\theta)\rangle\langle w_k(\theta)|, \qquad |w_k(\theta)\rangle = \frac{1}{\sqrt{p_k(\theta)}} |v_k(\theta)\rangle. \tag{2.9} \]

Thus the canonical decomposition leads to the spectral decomposition of the output state (Sarovar and Milburn, 2006).



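Proposition 2.1 The SM bound, $C_\Upsilon(\theta)$, can be expressed as (omitting $\theta$)

\[ C_\Upsilon = \sum_{k,\, p_k \neq 0} \frac{p_k'^2}{p_k} + 4 \sum_{j<k,\, p_j + p_k > 0} (p_j + p_k) |\langle w_j' | w_k \rangle|^2 + 4 \sum_k p_k |\langle w_k' | w_k \rangle|^2. \tag{2.10} \]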

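Proof. For simplicity, it is assumed that for all $p_j(\theta)$ either

(i) $p_j(\theta) > 0$ for all $\theta$, or

(ii) $p_j(\theta) = 0$ for all $\theta$.

When $p_j(\theta) = 0$ for all $\theta$, it follows from the definitions of $|v_j\rangle$ and $|w_j\rangle$ that

\[ \Upsilon_j |\psi_0\rangle = \sqrt{p_j} |w_j\rangle = 0, \qquad \Upsilon_j' |\psi_0\rangle = 0, \qquad \mathrm{tr}\{\Upsilon_j' \rho_0 \Upsilon_j'^{\dagger}\} = \langle\psi_0| \Upsilon_j'^{\dagger} \Upsilon_j' |\psi_0\rangle = 0. \]

When $p_j(\theta) > 0$ for all $\theta$,

\[ \Upsilon_j' |\psi_0\rangle = \frac{p_j'}{2\sqrt{p_j}} |w_j\rangle + \sqrt{p_j} |w_j'\rangle. \]

Then

\[ \langle\psi_0| \Upsilon_j'^{\dagger} \Upsilon_j' |\psi_0\rangle = \frac{p_j'^2}{4 p_j} + \frac{p_j'}{2} \left( \langle w_j' | w_j \rangle + \langle w_j | w_j' \rangle \right) + p_j \langle w_j' | w_j' \rangle. \tag{2.11} \]

The right hand side of (2.11) can be simplified, because

\[ \langle w_j' | w_j \rangle + \langle w_j | w_j' \rangle = \frac{d}{d\theta} \mathrm{tr}\{\rho_j\} = 0, \qquad \rho_j = |w_j\rangle\langle w_j|. \tag{2.12} \]

(It follows from (2.12) that $\langle w_j' | w_j \rangle$ is purely imaginary.) Thus

\[ C_\Upsilon = 4 \sum_{j,\, p_j \neq 0} \left( \frac{p_j'^2}{4 p_j} + p_j \langle w_j' | w_j' \rangle \right). \]

Inserting the identity $I = \sum_k |w_k\rangle\langle w_k|$ into $\langle w_j' | w_j' \rangle$ gives

\[ C_\Upsilon = \sum_{j,\, p_j \neq 0} \frac{p_j'^2}{p_j} + \sum_{j,k} 4 p_j \langle w_j' | w_k \rangle \langle w_k | w_j' \rangle = \sum_{j,\, p_j \neq 0} \frac{p_j'^2}{p_j} + \sum_{j,k} 4 p_j |\langle w_j' | w_k \rangle|^2. \tag{2.13} \]

The right hand side of (2.13) will be re-written in (2.18). Since

\[ \langle w_j | w_k \rangle = \delta_{jk}, \]

it follows that

\[ \frac{\partial}{\partial\theta} \langle w_j | w_k \rangle = \langle w_j' | w_k \rangle + \langle w_j | w_k' \rangle = 0, \]

\[ \langle w_j' | w_k \rangle = -\langle w_j | w_k' \rangle, \tag{2.14} \]

\[ |\langle w_j' | w_k \rangle|^2 = \langle w_j' | w_k \rangle \langle w_k | w_j' \rangle = \left(-\langle w_j | w_k' \rangle\right) \left(-\langle w_k' | w_j \rangle\right) = |\langle w_k' | w_j \rangle|^2. \tag{2.15} \]

Now,

\[ \sum_{j,k} 4 p_j |\langle w_j' | w_k \rangle|^2 = \sum_{j<k} 4 p_j |\langle w_j' | w_k \rangle|^2 + \sum_{k<j} 4 p_j |\langle w_j' | w_k \rangle|^2 + \sum_{k} 4 p_k |\langle w_k' | w_k \rangle|^2. \tag{2.16} \]

Swapping the indices $j$ and $k$ in the second term and using (2.15) simplifies (2.16) further to

\[ \sum_{j,k} 4 p_j |\langle w_j' | w_k \rangle|^2 = \sum_{j<k} 4 (p_j + p_k) |\langle w_j' | w_k \rangle|^2 + \sum_{k} 4 p_k |\langle w_k' | w_k \rangle|^2. \tag{2.17} \]

Thus, from (2.13) and (2.17), the SM bound $C_\Upsilon(\theta)$ can be rewritten as

\[ C_\Upsilon = \sum_{j,\, p_j \neq 0} \frac{p_j'^2}{p_j} + 4 \sum_{j<k,\, p_j + p_k > 0} (p_j + p_k) |\langle w_j' | w_k \rangle|^2 + 4 \sum_k p_k |\langle w_k' | w_k \rangle|^2. \tag{2.18} \]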
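The equivalence between the definition of the SM bound and its spectral form (2.18) can be checked numerically for a small example. The sketch below assumes a hypothetical qubit family with eigenvalues $(\theta, 1-\theta)$ and rotating eigenvectors, one of which carries a $\theta$-dependent phase so that the last term of (2.18) is non-zero. For a pure input state the definition (2.3) reduces to $4\sum_k \langle v_k'|v_k'\rangle$ with $|v_k\rangle = \sqrt{p_k}\,|w_k\rangle$. All derivatives are central differences.

```python
import cmath
import math

def family(theta):
    """Hypothetical output family: eigenvalues p_k and eigenvectors w_k."""
    c, s = math.cos(theta), math.sin(theta)
    phase = cmath.exp(1j * theta)
    p = [theta, 1.0 - theta]
    w = [[phase * c, phase * s], [-s + 0j, c + 0j]]
    return p, w

def inner(a, b):
    """Inner product <a|b>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def derivative(f, theta, h=1e-5):
    """Central difference of a list-valued function of theta."""
    up, lo = f(theta + h), f(theta - h)
    return [(u - l) / (2 * h) for u, l in zip(up, lo)]

def sm_bound_definition(theta):
    """4 * sum_k <v_k'|v_k'> with |v_k> = sqrt(p_k) |w_k>."""
    total = 0.0
    for k in range(2):
        def v(t, k=k):
            p, w = family(t)
            return [math.sqrt(p[k]) * x for x in w[k]]
        dv = derivative(v, theta)
        total += 4 * inner(dv, dv).real
    return total

def sm_bound_spectral(theta):
    """The three sums of the spectral form of the SM bound."""
    p, w = family(theta)
    dp = derivative(lambda t: family(t)[0], theta)
    dw = [derivative(lambda t, k=k: family(t)[1][k], theta) for k in range(2)]
    first = sum(dp[k] ** 2 / p[k] for k in range(2) if p[k] > 0)
    second = 4 * sum((p[j] + p[k]) * abs(inner(dw[j], w[k])) ** 2
                     for j in range(2) for k in range(j + 1, 2))
    third = 4 * sum(p[k] * abs(inner(dw[k], w[k])) ** 2 for k in range(2))
    return first + second + third

c_def = sm_bound_definition(0.3)
c_spec = sm_bound_spectral(0.3)
```

Both routes give $1/\theta + 1/(1-\theta) + 4 + 4\theta$ for this family. The $4\theta$ contribution comes entirely from the phase of the first eigenvector, which illustrates the phase ambiguity mentioned in Remark 2.1.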



Remark 2.2 It can be seen that $C_\Upsilon(\theta)$ can be described solely in terms of the family of output states. The SM bound was originally derived as an upper bound on the Fisher information for a one-parameter family of quantum channels. Since any parametric family of quantum states can be written in the form

\[ \rho_\theta = \sum_k p_k(\theta) |w_k(\theta)\rangle\langle w_k(\theta)|, \]

$C_\Upsilon(\theta)$ can be extended to an upper bound on the Fisher information for one-parameter families of states.

It can be seen from the form of (2.18) that $C_\Upsilon(\theta)$ is a Riemannian metric on a 1-dimensional manifold (see Section 1.9).

Proposition 2.2 The SLD quantum information can be written as (omitting $\theta$)

\[ H = \sum_{k,\, p_k \neq 0} \frac{p_k'^2}{p_k} + 4 \sum_{j<k,\, p_j + p_k > 0} \frac{(p_j - p_k)^2}{p_j + p_k} |\langle w_j' | w_k \rangle|^2. \tag{2.19} \]



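Proof. The SLD is defined as any self-adjoint solution $\lambda$ of the matrix equation

\[ \frac{d\rho}{d\theta} = \frac{1}{2} (\rho\lambda + \lambda\rho). \tag{2.20} \]

The SLD quantum information is defined as

\[ H = \mathrm{tr}\{\rho\lambda^2\}. \]

Substituting the spectral decomposition of $\rho$ into (2.20) gives

\[ \sum_k \left( p_k' |w_k\rangle\langle w_k| + p_k |w_k'\rangle\langle w_k| + p_k |w_k\rangle\langle w_k'| \right) = \frac{1}{2} \left( \sum_l p_l |w_l\rangle\langle w_l| \lambda + \lambda \sum_m p_m |w_m\rangle\langle w_m| \right). \tag{2.21} \]

From (2.21) the components of the SLD are calculated. First, the diagonal elements $\lambda_{jj}$ are considered. Pre-multiplying (2.21) by $\langle w_j|$ and post-multiplying by $|w_j\rangle$ gives, on the left hand side,

\[ p_j' + p_j \left( \langle w_j | w_j' \rangle + \langle w_j' | w_j \rangle \right) = p_j' \]

by (2.12), and on the right hand side

\[ p_j \langle w_j | \lambda | w_j \rangle. \]

Hence, provided that $p_j > 0$,

\[ \lambda_{jj} = \frac{p_j'}{p_j}. \]

The diagonal elements $\lambda_{jj}$ are not defined when $p_j = 0$. In this case, a particular solution of $\lambda$ is chosen for which $\lambda_{jj} = 0$. Next, the off-diagonal components $\lambda_{jk}$ are considered. Pre-multiplying (2.21) by $\langle w_j|$ and post-multiplying by $|w_k\rangle$ gives, on the left hand side,

\[ p_k \langle w_j | w_k' \rangle + p_j \langle w_j' | w_k \rangle = (p_j - p_k) \langle w_j' | w_k \rangle \]

by (2.14), and on the right hand side

\[ \frac{1}{2} (p_j + p_k) \langle w_j | \lambda | w_k \rangle. \]

Thus, provided that $p_j + p_k > 0$,

\[ \lambda_{jk} = \frac{2 (p_j - p_k) \langle w_j' | w_k \rangle}{p_j + p_k}. \]

The entries $\lambda_{jk}$ are not defined when $p_j + p_k = 0$. Again a particular solution of $\lambda$ is chosen for which $\lambda_{jk} = 0$ when $p_j + p_k = 0$. This gives the following particular solution of the SLD

\[ \lambda = \sum_{k,\, p_k \neq 0} \frac{p_k'}{p_k} |w_k\rangle\langle w_k| + \sum_{j \neq k,\, p_j + p_k > 0} 2 \frac{p_j - p_k}{p_j + p_k} \langle w_j' | w_k \rangle |w_j\rangle\langle w_k|. \tag{2.22} \]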



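Denote by $\lambda^{2*}$ the part of $\lambda^2$ which makes a non-zero contribution to $\mathrm{tr}\{\rho\lambda^2\}$. Only terms of the form $z_k |w_k\rangle\langle w_k|$, with $z_k \in \mathbb{C}$, in $\lambda^2$ will contribute to $\mathrm{tr}\{\rho\lambda^2\}$. Thus,

\begin{align}
\lambda^{2*} &= \sum_{k,\, p_k \neq 0} \left(\frac{p_k'}{p_k}\right)^2 |w_k\rangle\langle w_k| + \sum_{j \neq k,\, p_j + p_k > 0} 4 \, \frac{p_j - p_k}{p_j + p_k} \, \frac{p_k - p_j}{p_k + p_j} \, \langle w_j' | w_k \rangle \langle w_k' | w_j \rangle \, |w_j\rangle\langle w_j| \\
&= \sum_{k,\, p_k \neq 0} \left(\frac{p_k'}{p_k}\right)^2 |w_k\rangle\langle w_k| + \sum_{j \neq k,\, p_j + p_k > 0} 4 \left(\frac{p_j - p_k}{p_j + p_k}\right)^2 |\langle w_j' | w_k \rangle|^2 \, |w_j\rangle\langle w_j|,
\end{align}

using (2.14). This gives

\[ H = \sum_{k,\, p_k \neq 0} \frac{p_k'^2}{p_k} + \sum_{j \neq k,\, p_j + p_k > 0} 4 p_j \left(\frac{p_j - p_k}{p_j + p_k}\right)^2 |\langle w_j' | w_k \rangle|^2. \tag{2.23} \]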



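The second term on the right hand side of (2.23) can be rewritten as

\[ \sum_{j \neq k,\, p_j + p_k > 0} 4 p_j \left(\frac{p_j - p_k}{p_j + p_k}\right)^2 |\langle w_j' | w_k \rangle|^2 = \sum_{j<k,\, p_j + p_k > 0} 4 p_j \left(\frac{p_j - p_k}{p_j + p_k}\right)^2 |\langle w_j' | w_k \rangle|^2 + \sum_{k<j,\, p_j + p_k > 0} 4 p_j \left(\frac{p_j - p_k}{p_j + p_k}\right)^2 |\langle w_j' | w_k \rangle|^2. \]

Swapping the indices $j$ and $k$ in the second term on the right hand side of the above equation and using (2.15) gives

\[ \sum_{j \neq k,\, p_j + p_k > 0} 4 p_j \left(\frac{p_j - p_k}{p_j + p_k}\right)^2 |\langle w_j' | w_k \rangle|^2 = \sum_{j<k,\, p_j + p_k > 0} 4 \, \frac{(p_j - p_k)^2}{p_j + p_k} |\langle w_j' | w_k \rangle|^2. \tag{2.24} \]

The required result (2.19) follows from (2.23) and (2.24).

Theorem 2.1

\[ H_\theta \leq C_\Upsilon(\theta), \tag{2.25} \]

with equality if and only if

\[ \langle w_j' | w_k \rangle = 0, \qquad \forall j, k \text{ with } p_j, p_k > 0. \tag{2.26} \]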



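Proof. The first terms in $H_\theta$, given in (2.19), and $C_\Upsilon(\theta)$, given in (2.10), are identical. Thus

\[ C_\Upsilon(\theta) - H_\theta = A_C(\theta) - A_H(\theta) + B_C(\theta), \]

where (omitting $\theta$)

\[ A_H = \sum_{j<k,\, p_j + p_k > 0} 4 \, \frac{(p_j - p_k)^2}{p_j + p_k} |\langle w_j' | w_k \rangle|^2, \]

\[ A_C = \sum_{j<k,\, p_j + p_k > 0} 4 (p_j + p_k) |\langle w_j' | w_k \rangle|^2, \]

\[ B_C = 4 \sum_{k,\, p_k \neq 0} p_k |\langle w_k' | w_k \rangle|^2. \]

The terms $A_C$ and $A_H$ are symmetric in $j$ and $k$ due to (2.15). Now

\[ A_C - A_H = \frac{1}{2} \sum_{j \neq k,\, p_j + p_k > 0} 4 \, \frac{(p_j + p_k)^2 - (p_j - p_k)^2}{p_j + p_k} |\langle w_j' | w_k \rangle|^2 = \sum_{j \neq k,\, p_j + p_k > 0} 8 \, \frac{p_j p_k}{p_j + p_k} |\langle w_j' | w_k \rangle|^2. \]

Changing the range of the summation to $j \neq k$ where $p_j, p_k > 0$, and adding $B_C$ gives

\[ C_\Upsilon - H = \sum_{j,k,\, p_j, p_k > 0} 8 \, \frac{p_j p_k}{p_j + p_k} |\langle w_j' | w_k \rangle|^2. \tag{2.27} \]

Since the right hand side of (2.27) is non-negative, (2.25) follows.

Equality holds in (2.25) if and only if the right hand side of (2.27) is zero, which holds if and only if (2.26) holds.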

Lemma 2.1 For channels with output states for which $p_j(\theta) > 0$ for all $j$ and $\theta$, the bound (2.25) is achievable if and only if the channel is quasi-classical.

Proof. Equality holds in (2.25) if and only if (2.26) is satisfied. When $p_j(\theta) > 0$ for all $j$ and $\theta$, condition (2.26) is satisfied if and only if $|w_j'\rangle$ has zero components along every vector $|w_k\rangle$. This is possible only if $|w_j'\rangle = 0$ and hence the channel is quasi-classical.

Lemma 2.2 For unitary channels, the bound (2.25) is achievable if and only if

\[ \mathrm{tr}\{U_\theta \rho_0 U_\theta'^{\dagger}\} = 0. \tag{2.28} \]

Proof. Equality holds in (2.25) if and only if (2.26) is satisfied. For unitary channels there is only one non-zero $p_j$ and $|w_j\rangle = U_\theta |\psi_0\rangle$, where $\rho_0 = |\psi_0\rangle\langle\psi_0|$. Condition (2.26) is satisfied if and only if $\langle w_j' | w_j \rangle = 0$. This is equivalent to (2.28).

Remark 2.3 Note that, for the most common unitary channels - those of the 
form exp(i9H), with H a self-adjoint matrix - condition (2.28) is satisfied. 



Example 2.1 There exist channels which are neither quasi-classical nor unitary for which equality holds in (2.25). The channel with an arbitrary pure input state and output states

$$\rho_\theta = \theta^2\, |w_1(\theta)\rangle\langle w_1(\theta)| + (1-\theta^2)\, |w_2(\theta)\rangle\langle w_2(\theta)|, \quad 0 < \theta < 1,$$

where

$$|w_1(\theta)\rangle = (\theta,\ \sqrt{1-\theta^2},\ 0)^T, \quad |w_2(\theta)\rangle = (0,\ 0,\ 1)^T,$$

satisfies (2.26), and so equality holds in (2.25).
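
Condition (2.26) for this example can be checked numerically. The sketch below is illustrative only: it assumes the eigenvectors $|w_1(\theta)\rangle = (\theta, \sqrt{1-\theta^2}, 0)^T$ and $|w_2(\theta)\rangle = (0,0,1)^T$, approximates $|w'_j\rangle$ by a central finite difference, and reports the largest overlap $|\langle w'_j|w_k\rangle|$. The function names and the step size `h` are hypothetical choices.

```python
import math


def w1(theta):
    """Eigenvector |w_1(theta)> = (theta, sqrt(1 - theta^2), 0)^T (real entries)."""
    return [theta, math.sqrt(1.0 - theta * theta), 0.0]


def w2(theta):
    """Eigenvector |w_2(theta)> = (0, 0, 1)^T, which does not depend on theta."""
    return [0.0, 0.0, 1.0]


def derivative(vec_fn, theta, h=1e-6):
    """Central finite-difference approximation to d|w(theta)>/d theta."""
    plus, minus = vec_fn(theta + h), vec_fn(theta - h)
    return [(a - b) / (2.0 * h) for a, b in zip(plus, minus)]


def inner(u, v):
    """Inner product <u|v>; all vectors in this example are real."""
    return sum(a * b for a, b in zip(u, v))


def max_overlap(theta):
    """Largest |<w'_j|w_k>| over j, k in {1, 2} at the given theta."""
    vecs = [w1, w2]
    return max(abs(inner(derivative(f, theta), g(theta)))
               for f in vecs for g in vecs)


if __name__ == "__main__":
    for theta in (0.2, 0.5, 0.8):
        print(theta, max_overlap(theta))
```

All overlaps vanish up to finite-difference error, in agreement with the claim that the family satisfies (2.26) although the eigenvectors depend on $\theta$.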



Theorem 2.2

$$F^M_\theta \le C_\Upsilon(\theta), \tag{2.29}$$

with equality if and only if

$$\langle w'_j | w_k \rangle = 0, \quad \forall j,k \text{ with } p_j, p_k > 0. \tag{2.30}$$



Proof. Inequality (2.29) follows from (1.64) and (2.25). Equality holds in (2.29) if and only if there is equality in both (1.64) and (2.25). For one-parameter families of states it is always possible to find a POVM $M_\theta$, depending on $\theta$, which achieves equality in (1.64) (Braunstein and Caves, 1994). However, equality holds in (2.25) if and only if (2.30) is satisfied.



Theorem 2.3

$$H_\theta \le C_E(\theta), \tag{2.31}$$

with equality if and only if the set of output states satisfies (2.26), and a fixed unitary matrix $U = [u_{jk}]$ exists such that the Kraus operators $E_j$ are related to the canonical Kraus operators by

$$E_j(\theta) = \sum_k u_{jk} \Upsilon_k(\theta).$$



Proof. Inequality (2.31) will be proved by considering two cases:

(i) When equality is attainable in (2.2), it is attainable also in (2.5) (Sarovar and Milburn, 2006). In this case, $C_\Upsilon(\theta) \le C_E(\theta)$ for all other sets of Kraus operators $E = \{E_j\}$ (Sarovar and Milburn, 2006). Inequality (2.25) gives $H_\theta \le C_E(\theta)$.

(ii) When (2.2) is not attainable, $F^M_\theta < C_E(\theta)$ for all $M$. For one-parameter families of states there always exists a measurement $M_\theta$ such that $F^{M_\theta}_\theta = H_\theta$. Thus $H_\theta = F^{M_\theta}_\theta < C_E(\theta)$.

Equality holds in (2.31) only if the bound given by the canonical Kraus operators $C_\Upsilon$ is attainable. The bound $C_\Upsilon$ is attainable if and only if the set of output states satisfies (2.26). It has been shown (Nielsen and Chuang 2000, p. 372) that if two sets of Kraus operators $\{E_j\}$ and $\{F_k\}$ lead to the same quantum channel then they must be related by

$$E_j = \sum_k u_{jk} F_k, \tag{2.32}$$

where $U = [u_{jk}]$ is a unitary matrix. When $C_\Upsilon$ is attainable (Sarovar and Milburn, 2006),

$$C_E = C_\Upsilon + 4 \sum_{jk} p_k |u'_{jk}|^2.$$

Thus for equality in (2.31) it is further required that $\sum_{jk} p_k |u'_{jk}|^2 = 0$. This is satisfied if and only if a unitary matrix $U = [u_{jk}]$ exists satisfying (2.32) that does not depend on $\theta$.



Remark 2.4 Condition (2.6) cannot be used generally to test for optimality of POVMs. Condition (2.6) is a necessary and sufficient condition for equality between the Fisher information and the SM bound. Since it is not generally possible to achieve equality between the Fisher information and the SM bound, condition (2.6) cannot be achieved for general models. Thus it cannot be used generally to test for POVMs giving maximal Fisher information.



2.3 Multi-parameter channels 



2.3.1 The multi-parameter SM bound 

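The SM bound for a multi-parameter family of channels will be defined as the matrix $C_\Upsilon(\theta)$ with entries

$$C_\Upsilon(\theta)_{jk} = 4 \sum_i \Re\, \mathrm{tr}\{\Upsilon_i(\theta)^{(j)} \rho_0 \Upsilon_i(\theta)^{(k)\dagger}\}, \quad \Upsilon_i(\theta)^{(k)} = \frac{\partial}{\partial\theta^k} \Upsilon_i(\theta). \tag{2.33}$$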



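Proposition 2.3 For $\theta$ and $v$ in $\mathbb{R}^p$, and $t \to 0$,

$$\frac{d}{dt} \Upsilon_k(\theta + tv) = \sum_l \Upsilon_k(\theta)^{(l)} v^l + O(t), \tag{2.34}$$

$$\lambda_t = \sum_l \lambda^l_\theta v^l + O(t), \tag{2.35}$$

where $\lambda_t$ is defined by (2.22) with respect to the parameter $t$, $\lambda^l_\theta$ is defined by (2.22) with respect to the parameter $\theta^l$ and $v^l$ is the $l$th component of the vector $v$.

Proof.

Put $\phi(t) = \theta + tv$, with components $\phi^l(t) = \theta^l + tv^l$. Using the chain rule to differentiate $\Upsilon_k(\phi(t))$ gives

$$\frac{d}{dt} \Upsilon_k(\phi(t)) = \sum_l \frac{\partial \Upsilon_k(\phi)}{\partial \phi^l} \frac{d\phi^l}{dt}. \tag{2.36}$$

Now,

$$\left. \frac{\partial \Upsilon_k(\phi)}{\partial \phi^l} \right|_{t=0} = \frac{\partial \Upsilon_k(\theta)}{\partial \theta^l}, \quad \frac{d\phi^l}{dt} = v^l.$$

Substituting these back into (2.36) gives (2.34). Similarly, for $t \to 0$,

$$p_k(\theta + tv) = p_k(\theta) + O(t), \tag{2.37}$$

$$\frac{dp_k(\theta + tv)}{dt} = \sum_l p_k^{(l)} v^l + O(t), \quad p_k^{(l)} = \frac{\partial p_k(\theta)}{\partial \theta^l}, \tag{2.38}$$

$$\frac{d|w_k(\theta + tv)\rangle}{dt} = \sum_l |w_k^{(l)}\rangle v^l + O(t), \quad |w_k^{(l)}\rangle = \frac{\partial |w_k(\theta)\rangle}{\partial \theta^l}. \tag{2.39}$$

Substituting (2.37) - (2.39) into (2.22) gives

$$\lambda_t = \sum_k \frac{\sum_l p_k^{(l)} v^l + O(t)}{p_k + O(t)}\, |w_k\rangle\langle w_k| + 2 \sum_{j \neq k,\ p_j+p_k>0} \frac{p_j - p_k}{p_j + p_k} \sum_l v^l \langle w_j^{(l)}|w_k\rangle\, |w_j\rangle\langle w_k| + O(t)$$

$$= \sum_l v^l \left[ \sum_k \frac{p_k^{(l)}}{p_k}\, |w_k\rangle\langle w_k| + 2 \sum_{j \neq k,\ p_j+p_k>0} \frac{p_j - p_k}{p_j + p_k} \langle w_j^{(l)}|w_k\rangle\, |w_j\rangle\langle w_k| \right] + O(t).$$

Thus $\lambda_t$ has the form (2.35).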
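Theorem 2.4 For multi-parameter channels,

$$H_\theta \le C_\Upsilon(\theta), \tag{2.40}$$

with equality if and only if

$$\langle w_j^{(l)} | w_k \rangle = 0, \quad \forall j,k,l \text{ with } p_j, p_k > 0, \tag{2.41}$$

where $|w_k^{(l)}\rangle = \partial |w_k\rangle / \partial\theta^l$.

Proof. Equation (2.40) is equivalent to

$$v^T H_\theta v \le v^T C_\Upsilon(\theta) v, \quad \text{for all } v \in \mathbb{R}^p. \tag{2.42}$$

For given $\theta$ and $v$ in $\mathbb{R}^p$, consider the set of one-parameter channels

$$\rho_0 \mapsto \sum_k \Upsilon_k(\theta + tv)\, \rho_0\, \Upsilon_k^\dagger(\theta + tv), \quad t \in \mathbb{R}. \tag{2.43}$$

From Theorem 2.1 it is known that $H_t \le C_\Upsilon(t)$, i.e.

$$\mathrm{tr}\{\rho \lambda_t^2\} \le 4 \sum_{i=1}^d \mathrm{tr}\left\{ \frac{d\Upsilon_i}{dt}\, \rho_0\, \frac{d\Upsilon_i^\dagger}{dt} \right\}.$$

Using (2.34) and (2.35) and evaluating at $t = 0$ gives

$$\sum_{m,n} v^m v^n\, \Re\, \mathrm{tr}\{\rho \lambda^m_\theta \lambda^n_\theta\} \le 4 \sum_{m,n,i} v^m v^n\, \Re\, \mathrm{tr}\{\Upsilon_i(\theta)^{(m)} \rho_0 \Upsilon_i(\theta)^{(n)\dagger}\}.$$

This is equivalent to (2.42). Since this holds for all $v \in \mathbb{R}^p$, (2.40) holds.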



Equality in (2.40) is equivalent to

$$v^T H_\theta v = v^T C_\Upsilon(\theta) v, \tag{2.44}$$

for all $v \in \mathbb{R}^p$. It follows that (2.44) holds for all $v \in \mathbb{R}^p$ if and only if, for one-parameter channels of the form (2.43) for given $\theta$ and $v \in \mathbb{R}^p$, $H_t|_{t=0} = C_\Upsilon(t)|_{t=0}$. From Theorem 2.1, this holds if and only if the channel (2.43) satisfies (2.26) at the point $t = 0$. This condition is equal to

$$\left. \left\langle \frac{dw_j}{dt} \Big| w_k \right\rangle \right|_{t=0} = 0, \quad \forall j,k \text{ with } p_j, p_k > 0.$$

Using (2.39), this condition can be rewritten as

$$\sum_{l=1}^p v^l \langle w_j^{(l)} | w_k \rangle = 0, \quad \forall j,k \text{ with } p_j, p_k > 0. \tag{2.45}$$

Condition (2.45) holds for all $v$ if and only if (2.41) is satisfied.



Lemma 2.3 For channels with output states for which $p_j(\theta) > 0$ for all $j$ and $\theta$, equality holds in (2.40) if and only if the channel is quasi-classical.

Proof. This follows from (2.41) and the same analysis as in Lemma 2.1.

Lemma 2.4 For unitary channels, equality holds in (2.40) if and only if

$$\mathrm{tr}\left\{ U_\theta\, \rho_0\, \frac{\partial U_\theta^\dagger}{\partial \theta^l} \right\} = 0, \quad \forall l.$$

Proof. This follows from (2.41) and the same analysis as in Lemma 2.2.



Example 2.2 There exist channels which are neither quasi-classical nor unitary for which equality holds in (2.40). The channel with an arbitrary pure input state and output states

$$\rho_\theta = f(\theta)^2\, |w_1(\theta)\rangle\langle w_1(\theta)| + (1 - f(\theta)^2)\, |w_2(\theta)\rangle\langle w_2(\theta)|,$$

where $f(\theta)$ and $g(\theta)$ are real functions of $\theta$ with $0 < f(\theta), g(\theta) < 1$ and

$$|w_1(\theta)\rangle = (g(\theta),\ \sqrt{1 - g(\theta)^2},\ 0)^T, \quad |w_2(\theta)\rangle = (0,\ 0,\ 1)^T,$$

satisfies (2.41) and hence achieves equality in (2.40).



Theorem 2.5 For multi-parameter channels,

$$F^M_\theta \le C_\Upsilon(\theta), \tag{2.46}$$

with equality if and only if (2.41) holds and there exists a POVM satisfying

$$M_m^{1/2} \lambda^j_\theta \rho^{1/2} = \xi^j_m M_m^{1/2} \rho^{1/2}, \quad \xi^j_m \in \mathbb{R}, \quad \forall j,m. \tag{2.47}$$

Proof. This follows from Theorems 1.1, 1.2 and 2.4.



Theorem 2.6 For multi-parameter channels,

$$H_\theta \le C_E(\theta), \tag{2.48}$$

with equality if and only if the set of output states satisfies (2.41) and a fixed unitary matrix $U = [u_{jk}]$ exists such that the Kraus operators $E_j$ are related to the canonical Kraus operators by

$$E_j(\theta) = \sum_k u_{jk} \Upsilon_k(\theta). \tag{2.49}$$

Proof. Inequality (2.48) follows from (2.31) and the same analysis used in the proof of Theorem 2.4 with $\Upsilon$ replaced by $E$.

Equality holds in (2.48) if and only if, for the set of channels (2.43), $H_t|_{t=0} = C_E(t)|_{t=0}$ for all $v$. From Theorem 2.3 this is satisfied if and only if the output states of the channel satisfy (2.26) at $t = 0$ and the Kraus operators $E_j$ are related to the canonical Kraus operators $\Upsilon_k$ by

$$E_j(\theta + tv)|_{t=0} = \left. \sum_k u_{jk}(\theta + tv)\, \Upsilon_k(\theta + tv) \right|_{t=0},$$

where

$$\left. \frac{du_{jk}}{dt} \right|_{t=0} = 0. \tag{2.50}$$

From the proof of Theorem 2.4 it can be seen that for channels of the form (2.43), satisfying (2.26) at $t = 0$ is equivalent to satisfying (2.41). Condition (2.50) can be rewritten as

$$\sum_l \frac{\partial u_{jk}}{\partial \theta^l} v^l = 0.$$

This is satisfied for all $v$ if and only if a unitary matrix $U = [u_{jk}]$ exists satisfying (2.49) that does not depend on $\theta$.
Theorem 2.7 For multi-parameter channels,

$$F^M_\theta \le C_E(\theta), \tag{2.51}$$

with equality if and only if the set of output states satisfies (2.41), a fixed unitary matrix $U = [u_{jk}]$ exists such that the Kraus operators $E_j$ are related to the canonical Kraus operators $\Upsilon_k$ by (2.49) and there exists a POVM satisfying (2.47).

Proof. This follows from Theorems 1.1, 1.2 and 2.6.

Chapter 3

The bound of Sarovar and 
Milburn as a metric on the 
space of quantum states 



3.1 Introduction 



Various statistical notions can be expressed in differential-geometric terms (Amari and Nagaoka, 2000). This area is sometimes known as 'information geometry'. Of special importance is Fisher information, which is the unique monotone metric on the space of probability measures (Morozova and Cencov, 1990). However, there is no unique monotone metric on the space of quantum states (Petz and Sudar 1996). (Definitions of monotonicity and invariance were given below (1.50).)

The following theorem of Morozova and Cencov (1990) is of great interest.



Theorem 3.1 A Riemannian metric is invariant if and only if at every density matrix

$$\rho = \sum_j p_j |j\rangle\langle j|,$$

the squared length of any tangent vector $A$ is of the form

$$C \sum_j \frac{1}{p_j} A_{jj}^2 + 2 \sum_{j<k} c(p_j, p_k) |A_{jk}|^2, \quad A_{jk} = \langle j|A|k\rangle, \tag{3.1}$$

where $C$ is a constant, $c(ax, ay) = a^{-1} c(x,y)$ and $c(x,y) = c(y,x)$.

This result was augmented by the following theorem of Petz and Sudar (1996).

Theorem 3.2 A Riemannian metric on the space of quantum states is monotone if and only if at every density matrix

$$\rho = \sum_j p_j |j\rangle\langle j|,$$

the squared length of any tangent vector $A$ is of the form (3.1) and the function $f(t) = 1/c(t,1)$ is operator monotone. (A function $f(t)$ is operator monotone if for self-adjoint $n \times n$ matrices $A$ and $B$, with $A \le B$, $f(A) \le f(B)$ (Bengtsson and Zyczkowski, 2006, Section 12.1).)

For parametric families of states, put $A = d\rho/d\theta$. In this case (3.1) becomes

$$C \sum_j \frac{1}{p_j} \left( \frac{dp_j}{d\theta} \right)^2 + 2 \sum_{j<k} c(p_j, p_k) \left| \left\langle w_j \Big| \frac{d\rho}{d\theta} \Big| w_k \right\rangle \right|^2. \tag{3.2}$$



For the SLD, KMB and RLD quantum informations (Petz and Sudar 1996), $C = 1$ and

$$c_{SLD}(x,y) = \frac{2}{x+y}, \quad c_{KMB}(x,y) = \frac{\ln x - \ln y}{x - y}, \quad c_{RLD}(x,y) = \frac{1}{2} \left( \frac{1}{x} + \frac{1}{y} \right).$$

For a more thorough background to the theory of metrics on the space of quantum states see (Bengtsson and Zyczkowski 2006, Chapter 14).



The Symmetric Logarithmic Derivative (SLD), Kubo-Mori Bogoliubov (KMB) and Right Logarithmic Derivative (RLD) metrics (see Section 1.9) are the most frequently encountered monotone metrics in recent literature. The SLD quantum information is the minimum monotone metric on the space of quantum states (Petz and Sudar, 1996). It has been used widely in the estimation of states (Helstrom 1967, 1976, Holevo, 1982, Hayashi, 2005) and quantum channels (Fujiwara 2001, 2002, 2004, Fujiwara and Imai 2003, Ballester 2004a,b). For one-parameter families of states, the SLD quantum information is equal to the maximum attainable Fisher information (Braunstein and Caves, 1994). The SLD quantum information is related to the Bures distance

$$b^2(\rho, \sigma) = 1 - \mathrm{tr}\left\{ \sqrt{\rho^{1/2} \sigma \rho^{1/2}} \right\} \tag{3.3}$$

in the following way (Hayashi, 2006b, (6.23))

$$H_\theta = 8 \lim_{\epsilon \to 0} \frac{b^2(\rho_\theta, \rho_{\theta+\epsilon})}{\epsilon^2}. \tag{3.4}$$

The Bures distance is a quantum analogue of the Hellinger distance

$$d_H^2(p, q) = 1 - \sum_{i=1}^k \sqrt{p_i q_i}, \tag{3.5}$$

where $p = (p_1, \ldots, p_k)$ and $q = (q_1, \ldots, q_k)$. The result (3.4) is interesting since, given a probability distribution $p_\theta = \{p_i(\theta)\}$, the 'classical' Fisher information is related to the Hellinger distance by

$$F_\theta = 8 \lim_{\epsilon \to 0} \frac{d_H^2(p_\theta, p_{\theta+\epsilon})}{\epsilon^2}. \tag{3.6}$$
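
The limit relating the classical Fisher information to the Hellinger distance can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the thesis: it takes $d_H^2(p,q) = 1 - \sum_i \sqrt{p_i q_i}$, uses a two-outcome (Bernoulli) family $p_\theta = (\theta, 1-\theta)$, whose Fisher information is $1/(\theta(1-\theta))$, and evaluates $8\, d_H^2(p_\theta, p_{\theta+\epsilon})/\epsilon^2$ at a small but finite $\epsilon$. The function names and the value of `eps` are hypothetical.

```python
import math


def hellinger_sq(p, q):
    """Squared Hellinger distance d_H^2(p, q) = 1 - sum_i sqrt(p_i q_i)."""
    return 1.0 - sum(math.sqrt(a * b) for a, b in zip(p, q))


def bernoulli(theta):
    """Two-outcome distribution p_theta = (theta, 1 - theta)."""
    return [theta, 1.0 - theta]


def fisher_from_hellinger(theta, eps=1e-3):
    """Finite-eps approximation 8 d_H^2(p_theta, p_{theta+eps}) / eps^2."""
    return 8.0 * hellinger_sq(bernoulli(theta), bernoulli(theta + eps)) / eps ** 2


def fisher_exact(theta):
    """Fisher information of the Bernoulli family, 1 / (theta (1 - theta))."""
    return 1.0 / (theta * (1.0 - theta))


if __name__ == "__main__":
    for theta in (0.3, 0.5, 0.7):
        print(theta, fisher_from_hellinger(theta), fisher_exact(theta))
```

The two columns agree up to a correction of order $\epsilon$, as expected for a limit evaluated at finite $\epsilon$.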



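The KMB quantum information is equal to the limit of the quantum relative entropy $D(\rho\|\sigma) = \mathrm{tr}(\rho(\ln\rho - \ln\sigma))$ (Hayashi, 2002). That is,

$$H^K_\theta = \lim_{\epsilon \to 0} \frac{2 D(\rho_\theta \| \rho_{\theta+\epsilon})}{\epsilon^2}. \tag{3.7}$$

This is analogous to the fact that the 'classical' Fisher information is the limit of the 'classical' relative entropy $D(p\|q) = \sum_{i=1}^k p_i \ln(p_i/q_i)$, where $p = (p_1, \ldots, p_k)$, $q = (q_1, \ldots, q_k)$. That is, given a probability distribution $p_\theta = \{p_i(\theta)\}$,

$$F_\theta = \lim_{\epsilon \to 0} \frac{2 D(p_\theta \| p_{\theta+\epsilon})}{\epsilon^2}. \tag{3.8}$$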



The RLD quantum information is the maximal monotone metric on the space of quantum states (Petz and Sudar 1996). It has also been used in estimation theory (Fujiwara 1994).

In Chapter 2 it was shown that Sarovar and Milburn's bound $C_\Upsilon(\theta)$ for one-parameter channels could be generalized to a Riemannian metric on $\Theta$. In this chapter $C_\Upsilon(\theta)$ will be referred to as the SM quantum information. It seems natural to look at the properties of $C_\Upsilon(\theta)$. Is it well-defined? Is it useful?

In this chapter it is shown that the SM quantum information is not a well-defined metric, since different choices of phase of the eigenvectors lead to different metrics. A new metric $C_L$ is defined from $C_\Upsilon$. Properties of $C_L$ are investigated and it is seen that it is invariant but not monotone.



3.2 Analysis of the SM quantum information 

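The SM quantum information for the family of states

$$\rho_\theta = \sum_{k=1}^d p_k(\theta)\, |w_k(\theta)\rangle\langle w_k(\theta)| \tag{3.9}$$

was shown in Proposition 2.1 to be equal to

$$C_\Upsilon = \sum_i \frac{1}{p_i} \left( \frac{dp_i}{d\theta} \right)^2 + 4 \sum_{j<k} (p_j + p_k)\, |\langle w'_j|w_k\rangle|^2 + 4 \sum_i p_i\, |\langle w'_i|w_i\rangle|^2. \tag{3.10}$$

This can be rewritten as

$$C_\Upsilon = \sum_i \frac{1}{p_i} \left( \frac{dp_i}{d\theta} \right)^2 + 4 \sum_{j<k} \frac{p_j + p_k}{(p_j - p_k)^2} \left| \left\langle w_j \Big| \frac{d\rho}{d\theta} \Big| w_k \right\rangle \right|^2 + 4 \sum_i p_i\, |\langle w'_i|w_i\rangle|^2. \tag{3.11}$$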



It can be seen that $C_\Upsilon(\theta)$ is not of the form (3.2), and hence is neither invariant nor monotone. The SM quantum information $C_\Upsilon(\theta)$ for a family of states is defined in terms of its eigenvectors and eigenvalues by (3.10). The eigenvectors of a state are unique up to a change of phase. It turns out that different choices of phase for the eigenvectors lead to different metrics.



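Example 3.1 Consider the set of 2-dimensional states

$$\rho_{r,\theta,\phi} = \frac{1}{2} \begin{pmatrix} 1 + r\cos\theta & r\sin\theta\, e^{-i\phi} \\ r\sin\theta\, e^{i\phi} & 1 - r\cos\theta \end{pmatrix} \tag{3.12}$$

with $0 \le r \le 1$, $0 \le \theta \le \pi$ and $0 \le \phi < 2\pi$. Any qubit, mixed or pure, can be written in the form (3.12) with specific values of $r$, $\theta$ and $\phi$. Each state (3.12) has spectral decomposition

$$\rho_{r,\theta,\phi} = \frac{1+r}{2}\, |w_1(\theta,\phi)\rangle\langle w_1(\theta,\phi)| + \frac{1-r}{2}\, |w_2(\theta,\phi)\rangle\langle w_2(\theta,\phi)|,$$

$$|w_1(\theta,\phi)\rangle = (\cos(\theta/2)\, e^{-i\phi/2},\ \sin(\theta/2)\, e^{i\phi/2})^T,$$

$$|w_2(\theta,\phi)\rangle = (\sin(\theta/2)\, e^{-i\phi/2},\ -\cos(\theta/2)\, e^{i\phi/2})^T.$$

The SM quantum information for the family of states calculated from the above eigenvalues and eigenvectors is

$$C_\Upsilon(r,\theta,\phi) = \begin{pmatrix} \frac{1}{1-r^2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Changing the eigenvectors by the phase shift $e^{-i\phi/2}$, i.e. $|w_k(\theta,\phi)\rangle \mapsto |w_k(\theta,\phi)\rangle e^{-i\phi/2}$, leaves the density matrix unchanged but the SM quantum information calculated from the eigenvalues and shifted eigenvectors becomes

$$C_\Upsilon(r,\theta,\phi) = \begin{pmatrix} \frac{1}{1-r^2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 + 2r\cos\theta \end{pmatrix}.$$

Hence the SM quantum information is not a well-defined metric.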
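
The dependence on the choice of phase can be reproduced numerically. The following sketch is illustrative only. It assumes that, because the eigenvalues $(1 \pm r)/2$ do not depend on $\phi$, the $\phi\phi$ entry of the SM quantum information reduces to $4\sum_i p_i \langle \partial_\phi w_i | \partial_\phi w_i \rangle$; the derivatives are approximated by central finite differences. The function names, the flag `shifted` and the step size `h` are hypothetical.

```python
import cmath
import math


def eigvecs(theta, phi, shifted=False):
    """Eigenvectors of the qubit state; optionally multiplied by exp(-i phi / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    em, ep = cmath.exp(-0.5j * phi), cmath.exp(0.5j * phi)
    w1 = [c * em, s * ep]
    w2 = [s * em, -c * ep]
    if shifted:
        w1 = [x * em for x in w1]
        w2 = [x * em for x in w2]
    return [w1, w2]


def sm_phi_phi(r, theta, phi, shifted=False, h=1e-5):
    """phi-phi entry of the SM information: 4 sum_i p_i <d_phi w_i|d_phi w_i>."""
    p = [(1 + r) / 2, (1 - r) / 2]  # eigenvalues, independent of phi
    plus = eigvecs(theta, phi + h, shifted)
    minus = eigvecs(theta, phi - h, shifted)
    total = 0.0
    for p_i, v_plus, v_minus in zip(p, plus, minus):
        dv = [(a - b) / (2 * h) for a, b in zip(v_plus, v_minus)]
        total += 4 * p_i * sum(abs(x) ** 2 for x in dv)
    return total


if __name__ == "__main__":
    r, theta, phi = 0.5, 1.0, 0.3
    print(sm_phi_phi(r, theta, phi))                # close to 1
    print(sm_phi_phi(r, theta, phi, shifted=True))  # close to 2 + 2 r cos(theta)
    print(2 + 2 * r * math.cos(theta))
```

The same density matrix therefore yields two different values of the $\phi\phi$ entry, depending only on the phase convention chosen for the eigenvectors.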



3.3 A new metric 



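The $C_L$ quantum information for the family of states (3.9) will be defined as

$$C_L = C_\Upsilon - 4 \sum_i p_i\, |\langle w'_i|w_i\rangle|^2. \tag{3.13}$$

Thus

$$C_L = \sum_i \frac{1}{p_i} \left( \frac{dp_i}{d\theta} \right)^2 + 4 \sum_{j<k} (p_j + p_k)\, |\langle w'_j|w_k\rangle|^2 \tag{3.14}$$

$$= \sum_i \frac{1}{p_i} \left( \frac{dp_i}{d\theta} \right)^2 + 4 \sum_{j<k} \frac{p_j + p_k}{(p_j - p_k)^2} \left| \left\langle w_j \Big| \frac{d\rho}{d\theta} \Big| w_k \right\rangle \right|^2. \tag{3.15}$$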



Remark 3.1 Unlike the RLD and KMB quantum informations, the $C_L$ quantum information can be defined for families of pure states. For pure states, $C_L(\rho_\theta) = H(\rho_\theta)$ (see (3.21)).

The $C_L$ quantum information is of the form (3.2) with $C = 1$ and

$$c_L(p_j, p_k) = \frac{2 (p_j + p_k)}{(p_j - p_k)^2}. \tag{3.16}$$

This function is symmetric and $c_L(ax, ay) = a^{-1} c_L(x,y)$. Hence, $C_L$ is invariant. Thus it does not suffer the same defect as $C_\Upsilon$. The $C_L$ quantum information provides each parameterized family $\{\rho_\theta : \theta \in \Theta\}$ with a unique Riemannian metric on $\Theta$.



For a metric to be monotone, it must be of the form (3.1) and the function $f(t)$ associated with the metric must be monotone and satisfy $f(t) = t f(t^{-1})$. The functions associated with the SLD, KMB and RLD quantum informations are

$$f_{SLD}(t) = \frac{1+t}{2}, \quad f_{KMB}(t) = \frac{t-1}{\log t}, \quad f_{RLD}(t) = \frac{2t}{1+t}.$$

Calculation shows that the function associated with $C_L$ is

$$f_{C_L}(t) = \frac{(t-1)^2}{2(t+1)}. \tag{3.17}$$

If $f$ is a monotone function then $f(0) \le f(t_1) \le f(t_2)$ whenever $0 < t_1 < t_2$. The function $f_{C_L}(t)$ satisfies $f_{C_L}(t) = t f_{C_L}(t^{-1})$ but is not monotone, as $f_{C_L}(0) > f_{C_L}(1)$. Hence, $C_L$ is an invariant but not monotone Riemannian metric.
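
The two properties of the function associated with $C_L$ can be checked directly. The sketch below assumes $f(t) = 1/c_L(t,1)$ with $c_L(x,y) = 2(x+y)/(x-y)^2$, i.e. $f(t) = (t-1)^2/(2(t+1))$; this normalisation is an assumption taken from the coefficient $4$ in the off-diagonal term of $C_L$, and a different overall constant would not change either conclusion. The function names are hypothetical.

```python
def f_cl(t):
    """f(t) = 1 / c_L(t, 1), assuming c_L(x, y) = 2 (x + y) / (x - y)^2."""
    return (t - 1.0) ** 2 / (2.0 * (t + 1.0))


def satisfies_symmetry(t, tol=1e-12):
    """Check the relation f(t) = t f(1/t) at a single point t > 0."""
    return abs(f_cl(t) - t * f_cl(1.0 / t)) < tol


if __name__ == "__main__":
    print([satisfies_symmetry(t) for t in (0.1, 0.5, 2.0, 7.0)])
    # A monotone function would need f(0) <= f(1); here the order is reversed.
    print(f_cl(0.0), f_cl(1.0))
```

The function decreases from $f(0)$ to its zero at $t = 1$ before increasing again, which is the failure of monotonicity used in the text.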



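Example 3.2 The depolarizing channel, (1.33), acts on 3-dimensional states in the following way

$$\rho \mapsto (1 - \epsilon)\rho + \frac{\epsilon}{3} \mathbb{1}_3, \quad 0 \le \epsilon \le 1. \tag{3.18}$$

Consider the one-parameter set of 3-dimensional mixed states

$$\rho_\theta = (1 - 2\delta)\, |w_1\rangle\langle w_1| + \delta\, |w_2\rangle\langle w_2| + \delta\, |w_3\rangle\langle w_3|,$$

$$|w_1\rangle = (1, 0, 0)^T,$$

$$|w_2\rangle = (0, \cos\theta, \sin\theta)^T,$$

$$|w_3\rangle = (0, -\sin\theta, \cos\theta)^T,$$

where $\delta$ is fixed. The $C_L(\theta)$ quantum information of this family of states is $8\delta$. Under the action of the depolarizing channel the set of output states is

$$\mathcal{E}(\rho_\theta) = \left( (1-\epsilon)(1-2\delta) + \frac{\epsilon}{3} \right) |w_1\rangle\langle w_1| + \left( (1-\epsilon)\delta + \frac{\epsilon}{3} \right) |w_2\rangle\langle w_2| + \left( (1-\epsilon)\delta + \frac{\epsilon}{3} \right) |w_3\rangle\langle w_3|$$

with $|w_i\rangle$ unchanged. The $C_L$ quantum information for the family of states $\mathcal{E}(\rho_\theta)$ is $8\delta + 8\epsilon(1/3 - \delta)$. Now

$$C_L(\mathcal{E}(\rho_\theta)) - C_L(\rho_\theta) = 8\epsilon(1/3 - \delta). \tag{3.19}$$

For $\epsilon > 0$ and $\delta < 1/3$, $C_L$ has increased under the action of a TP-CP map, thus demonstrating the non-monotonicity of $C_L$.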
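
The increase of $C_L$ under the depolarizing channel can be reproduced with a short calculation. The sketch below is a minimal illustration under stated assumptions: the eigenvalues do not depend on $\theta$, so $C_L$ reduces to $4\sum_{j<k}(p_j+p_k)|\langle w'_j|w_k\rangle|^2$, and the depolarizing channel only changes the eigenvalues, $p \mapsto (1-\epsilon)p + \epsilon/3$, leaving the eigenvectors fixed. Derivatives are approximated by central finite differences; the function names and sample values of $\delta$ and $\epsilon$ are hypothetical.

```python
import math


def eigenvectors(theta):
    """The three real eigenvectors |w_1>, |w_2(theta)>, |w_3(theta)>."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def c_l(p, theta, h=1e-6):
    """C_L for eigenvalues p independent of theta (the classical term vanishes)."""
    plus, minus = eigenvectors(theta + h), eigenvectors(theta - h)
    w = eigenvectors(theta)
    dw = [[(a - b) / (2 * h) for a, b in zip(vp, vm)]
          for vp, vm in zip(plus, minus)]
    total = 0.0
    for j in range(3):
        for k in range(j + 1, 3):
            overlap = sum(a * b for a, b in zip(dw[j], w[k]))  # <w'_j|w_k>
            total += 4.0 * (p[j] + p[k]) * overlap ** 2
    return total


def depolarize(p, eps):
    """Action of the depolarizing channel on the eigenvalues."""
    return [(1.0 - eps) * x + eps / 3.0 for x in p]


if __name__ == "__main__":
    delta, eps, theta = 0.1, 0.3, 0.7
    p = [1 - 2 * delta, delta, delta]
    before = c_l(p, theta)
    after = c_l(depolarize(p, eps), theta)
    print(before, after, 8 * eps * (1 / 3 - delta))
```

With $\delta = 0.1$ and $\epsilon = 0.3$ the difference between the two values matches $8\epsilon(1/3 - \delta)$ up to finite-difference error.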
3.4 Ordering of $C_L$, $C_\Upsilon$ and $H$

Theorem 3.3 Given a parameterised quantum model $\{\rho_\theta = \sum_{k=1}^d p_k(\theta)\, |w_k(\theta)\rangle\langle w_k(\theta)| : \theta \in \mathbb{R}^p\}$,

$$H_\theta \le C_L(\theta) \le C_\Upsilon(\theta), \tag{3.20}$$

where the multi-parameter versions of $H_\theta$, $C_L(\theta)$ and $C_\Upsilon(\theta)$ are defined in (3.44), (3.35) and (3.28) respectively. Equality holds in $H_\theta \le C_L(\theta)$ for families of states $\rho_\theta$ if and only if

$$\langle w_j^{(m)} | w_k \rangle = 0, \quad \forall m,\ j \neq k,\ p_j, p_k > 0. \tag{3.21}$$

Equality holds in $C_L(\theta) \le C_\Upsilon(\theta)$ for families of states $\rho_\theta$ if and only if

$$\langle w_i^{(m)} | w_i \rangle = 0, \quad \forall m, i,\ p_i > 0. \tag{3.22}$$

A proof of Theorem 3.3 will be given first for the one-parameter case and then for the general case.

3.4.1 One-parameter case 
Lemma 3.1

$$C_L(\theta) \le C_\Upsilon(\theta), \tag{3.23}$$

with equality if and only if

$$\langle w'_i | w_i \rangle = 0, \quad \forall i,\ p_i > 0. \tag{3.24}$$

Proof. This follows from the definition of $C_L$, (3.13), and the fact that $|\langle w'_i|w_i\rangle|^2$ is non-negative.


Lemma 3.2

$$H_\theta \le C_L(\theta), \tag{3.25}$$

with equality if and only if

$$\langle w'_j | w_k \rangle = 0, \quad \forall j \neq k,\ p_j, p_k > 0. \tag{3.26}$$

Proof. Proposition 2.2 showed that

$$H_\theta = \sum_k \frac{1}{p_k} \left( \frac{dp_k}{d\theta} \right)^2 + 4 \sum_{j<k} \frac{(p_j - p_k)^2}{p_j + p_k}\, |\langle w'_j|w_k\rangle|^2,$$

and hence

$$C_L - H = 16 \sum_{j<k} \frac{p_j p_k}{p_j + p_k}\, |\langle w'_j|w_k\rangle|^2, \tag{3.27}$$

since $|\langle w'_j|w_k\rangle|^2$ is symmetric with respect to $j$ and $k$ (2.15). The right hand side of (3.27) is non-negative, and equal to zero if and only if (3.26) holds.



3.4.2 The multi-parameter case 



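Proposition 3.1 In the multi-parameter case the SM quantum information is the matrix with entries

$$(C_\Upsilon)_{kl} = \sum_i \frac{1}{p_i} \frac{\partial p_i}{\partial\theta^k} \frac{\partial p_i}{\partial\theta^l} + 4 \Re \sum_{i<j} (p_i + p_j) \langle w_i^{(k)}|w_j\rangle \langle w_j|w_i^{(l)}\rangle + 4 \Re \sum_i p_i \langle w_i^{(k)}|w_i\rangle \langle w_i|w_i^{(l)}\rangle. \tag{3.28}$$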



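Proof. The multi-parameter version of $C_\Upsilon$ was defined, (2.33), as the matrix with entries

$$(C_\Upsilon)_{kl} = 4 \sum_i \Re\, \mathrm{tr}\{\Upsilon_i^{(k)} \rho_0 \Upsilon_i^{(l)\dagger}\}, \quad \Upsilon_i^{(k)} = \frac{\partial}{\partial\theta^k} \Upsilon_i.$$

Using (2.8) and (2.9),

$$\mathrm{tr}\{\Upsilon_i^{(k)} \rho_0 \Upsilon_i^{(l)\dagger}\} = \frac{1}{4 p_i} \frac{\partial p_i}{\partial\theta^k} \frac{\partial p_i}{\partial\theta^l} + p_i \langle w_i^{(l)}|w_i^{(k)}\rangle + \frac{1}{2} \frac{\partial p_i}{\partial\theta^l} \langle w_i|w_i^{(k)}\rangle + \frac{1}{2} \frac{\partial p_i}{\partial\theta^k} \langle w_i^{(l)}|w_i\rangle. \tag{3.29}$$

The contributions of the final two terms on the right hand side of (3.29) are zero since they are purely imaginary (see below (2.12)). Thus

$$(C_\Upsilon)_{kl} = \sum_i \frac{1}{p_i} \frac{\partial p_i}{\partial\theta^k} \frac{\partial p_i}{\partial\theta^l} + 4 \Re \sum_i p_i \langle w_i^{(l)}|w_i^{(k)}\rangle. \tag{3.30}$$

Inserting the identity $\mathbb{1}_d = \sum_j |w_j\rangle\langle w_j|$ into the second term on the right hand side of (3.29) gives

$$\sum_i p_i \langle w_i^{(l)}|w_i^{(k)}\rangle = \sum_{i \neq j} p_i \langle w_i^{(l)}|w_j\rangle \langle w_j|w_i^{(k)}\rangle + \sum_i p_i \langle w_i^{(l)}|w_i\rangle \langle w_i|w_i^{(k)}\rangle. \tag{3.31}$$

The first term on the right hand side of (3.31) can be written as

$$\sum_{i \neq j} p_i \langle w_i^{(l)}|w_j\rangle \langle w_j|w_i^{(k)}\rangle = \sum_{i<j} p_i \langle w_i^{(l)}|w_j\rangle \langle w_j|w_i^{(k)}\rangle + \sum_{i>j} p_i \langle w_i^{(l)}|w_j\rangle \langle w_j|w_i^{(k)}\rangle. \tag{3.32}$$

Swapping the indices, $i$ and $j$, on the second term on the right hand side of (3.32) gives

$$\sum_{i>j} p_i \langle w_i^{(l)}|w_j\rangle \langle w_j|w_i^{(k)}\rangle = \sum_{i<j} p_j \langle w_j^{(l)}|w_i\rangle \langle w_i|w_j^{(k)}\rangle = \sum_{i<j} p_j \langle w_i^{(k)}|w_j\rangle \langle w_j|w_i^{(l)}\rangle, \tag{3.33}$$

using (2.14). From (3.31), (3.32) and (3.33) it follows that

$$\Re \sum_i p_i \langle w_i^{(l)}|w_i^{(k)}\rangle = \Re \sum_{i<j} (p_i + p_j) \langle w_i^{(k)}|w_j\rangle \langle w_j|w_i^{(l)}\rangle + \Re \sum_i p_i \langle w_i^{(k)}|w_i\rangle \langle w_i|w_i^{(l)}\rangle. \tag{3.34}$$

The required result follows from (3.30) and (3.34).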



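The multivariate version of $C_L$ will be defined as the matrix with entries

$$(C_L)_{kl} = (C_\Upsilon)_{kl} - 4 \Re \sum_i p_i \langle w_i^{(k)}|w_i\rangle \langle w_i|w_i^{(l)}\rangle = \sum_i \frac{1}{p_i} \frac{\partial p_i}{\partial\theta^k} \frac{\partial p_i}{\partial\theta^l} + 4 \Re \sum_{i<j} (p_i + p_j) \langle w_i^{(k)}|w_j\rangle \langle w_j|w_i^{(l)}\rangle. \tag{3.35}$$

Lemma 3.3

$$C_L(\theta) \le C_\Upsilon(\theta), \tag{3.36}$$

with equality if and only if (3.22) holds.

Proof. Equation (3.36) is equivalent to

$$v^T C_L(\theta) v \le v^T C_\Upsilon(\theta) v, \tag{3.37}$$

for all $v \in \mathbb{R}^p$. For given $\theta$ and $v$ in $\mathbb{R}^p$, consider the set of one-parameter states

$$\rho_{\theta+tv} = \sum_{k=1}^d p_k(\theta + tv)\, |w_k(\theta + tv)\rangle\langle w_k(\theta + tv)|, \quad t \in \mathbb{R}.$$

It was shown in the proof of Proposition 2.3 that

$$\frac{d}{dt} |w_k(\theta + tv)\rangle = \sum_l |w_k(\theta)^{(l)}\rangle v^l + O(t), \quad t \to 0, \tag{3.38}$$

$$|w_k(\theta)^{(l)}\rangle = \frac{\partial}{\partial\theta^l} |w_k(\theta)\rangle, \tag{3.39}$$

where $v^l$ is the $l$th component of the vector $v$. From Lemma 3.1 it is known that $C_L(t) \le C_\Upsilon(t)$, i.e.

$$\sum_i \frac{1}{p_i} \left( \frac{dp_i}{dt} \right)^2 + 4 \sum_{j<k} (p_j + p_k) \left| \left\langle \frac{dw_j}{dt} \Big| w_k \right\rangle \right|^2 \le \sum_i \frac{1}{p_i} \left( \frac{dp_i}{dt} \right)^2 + 4 \sum_{j<k} (p_j + p_k) \left| \left\langle \frac{dw_j}{dt} \Big| w_k \right\rangle \right|^2 + 4 \sum_i p_i \left| \left\langle \frac{dw_i}{dt} \Big| w_i \right\rangle \right|^2, \tag{3.40}$$

where $p_i = p_i(\theta + tv)$ and $|w_i\rangle = |w_i(\theta + tv)\rangle$. Using (3.38) and (3.39) and evaluating at $t = 0$ gives

$$\sum_{m,n} v^m v^n \left[ \sum_i \frac{1}{p_i} \frac{\partial p_i}{\partial\theta^m} \frac{\partial p_i}{\partial\theta^n} + 4 \Re \sum_{i<j} (p_i + p_j) \langle w_i^{(m)}|w_j\rangle \langle w_j|w_i^{(n)}\rangle \right]$$

$$\le \sum_{r,s} v^r v^s \left[ \sum_i \frac{1}{p_i} \frac{\partial p_i}{\partial\theta^r} \frac{\partial p_i}{\partial\theta^s} + 4 \Re \sum_{i<j} (p_i + p_j) \langle w_i^{(r)}|w_j\rangle \langle w_j|w_i^{(s)}\rangle + 4 \Re \sum_i p_i \langle w_i^{(r)}|w_i\rangle \langle w_i|w_i^{(s)}\rangle \right].$$

This can be rewritten as

$$\sum_{m,n} v^m v^n (C_L)_{mn} \le \sum_{r,s} v^r v^s (C_\Upsilon)_{rs}.$$

This is equivalent to (3.37). Since this holds for all $v$ in $\mathbb{R}^p$, (3.36) holds.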



Equality in (3.36) is equivalent to

$$v^T C_L(\theta) v = v^T C_\Upsilon(\theta) v, \tag{3.41}$$

for all $v \in \mathbb{R}^p$. From the proof of Lemma 3.3 it is seen that (3.41) is satisfied for all $v \in \mathbb{R}^p$ if and only if, for one-parameter families of states $\rho_{\theta+tv}$, $C_L(t)|_{t=0} = C_\Upsilon(t)|_{t=0}$. From Lemma 3.1 this is possible if and only if the family of states satisfies (3.24) at the point $t = 0$. This condition is equal to

$$\left. \sum_i p_i(t) \left| \left\langle \frac{dw_i}{dt} \Big| w_i \right\rangle \right|^2 \right|_{t=0} = 0. \tag{3.42}$$

Using (3.39), this condition can be rewritten as

$$\sum_{m,n} v^m v^n \sum_{i=1}^d p_i \langle w_i^{(m)}|w_i\rangle \langle w_i|w_i^{(n)}\rangle = 0. \tag{3.43}$$

Condition (3.43) holds for all $v$ if and only if (3.22) is satisfied.



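Proposition 3.2 In the multi-parameter case the SLD quantum information is the matrix with entries

$$(H_\theta)_{kl} = \sum_i \frac{1}{p_i} \frac{\partial p_i}{\partial\theta^k} \frac{\partial p_i}{\partial\theta^l} + 4 \Re \sum_{i<j} \frac{(p_i - p_j)^2}{p_i + p_j} \langle w_i^{(k)}|w_j\rangle \langle w_j|w_i^{(l)}\rangle. \tag{3.44}$$

Proof. In the multi-parameter case a particular choice of SLD with respect to the parameter $\theta^k$ is

$$\lambda^k = \sum_i \frac{1}{p_i} \frac{\partial p_i}{\partial\theta^k}\, |w_i\rangle\langle w_i| + 2 \sum_{i \neq j} \frac{p_i - p_j}{p_i + p_j} \langle w_i^{(k)}|w_j\rangle\, |w_i\rangle\langle w_j|. \tag{3.45}$$

Proposition 3.2 follows almost identically to the one-parameter case (see proof of Proposition 2.2).

Lemma 3.4

$$H_\theta \le C_L(\theta), \tag{3.46}$$

with equality if and only if (3.21) holds.

Proof. This follows from Lemma 3.2 in the same way as Lemma 3.3 follows from Lemma 3.1.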



3.5 $C_L$ as the minimum of $C_\Upsilon$



Example 3.1 showed that for $C_\Upsilon$, different choices of eigenvectors of $\rho_\theta$ can result in completely different metrics. Since $C_\Upsilon$ is an upper bound on Fisher information, it seems sensible to choose the minimum among possible values of $C_\Upsilon$. It will now be investigated whether there exists a choice of eigenvectors such that $C_\Upsilon = C_L$.

3.5.1 One-parameter case

Consider a family of states $\rho_\theta = \sum_i p_i(\theta)\, |w_i(\theta)\rangle\langle w_i(\theta)|$. A phase change of the eigenvectors $|w_1(\theta)\rangle, \ldots, |w_d(\theta)\rangle$ sends these vectors to $|v_1(\theta)\rangle, \ldots, |v_d(\theta)\rangle$, where $|v_j(\theta)\rangle = e^{i\alpha_j(\theta)} |w_j(\theta)\rangle$ for some real-valued functions $\alpha_1, \ldots, \alpha_d$. The density matrix $\rho_\theta$ is unchanged. Now

$$\frac{d}{d\theta} |v_k(\theta)\rangle = i \frac{d\alpha_k}{d\theta} e^{i\alpha_k(\theta)} |w_k(\theta)\rangle + e^{i\alpha_k(\theta)} \frac{d}{d\theta} |w_k(\theta)\rangle$$

and hence

$$\langle v'_k | v_k \rangle = -i \frac{d\alpha_k}{d\theta} + \langle w'_k | w_k \rangle.$$

Choosing

$$\alpha_k(\theta) = -i \int^{\theta} \langle w'_k(\phi) | w_k(\phi) \rangle\, d\phi,$$

(3.24) is satisfied. (Since $\langle w'_k|w_k\rangle$ is purely imaginary, $\alpha_k$ is real.) Thus in the one-parameter case $C_L$ is the minimum among $C_\Upsilon$.
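3.5.2 Multi-parameter case

A phase change of the eigenvectors $|w_1(\theta)\rangle, \ldots, |w_d(\theta)\rangle$ sends these vectors to $|v_1(\theta)\rangle, \ldots, |v_d(\theta)\rangle$, where $|v_j(\theta)\rangle = e^{i\alpha_j(\theta)} |w_j(\theta)\rangle$ for some real-valued functions $\alpha_1, \ldots, \alpha_d$. In this case $\theta = (\theta^1, \ldots, \theta^p)$. Equality holds in (3.36) if and only if (3.22) holds. Now

$$\frac{\partial}{\partial\theta^m} |v_j(\theta)\rangle = i \frac{\partial\alpha_j}{\partial\theta^m} e^{i\alpha_j(\theta)} |w_j(\theta)\rangle + e^{i\alpha_j(\theta)} \frac{\partial}{\partial\theta^m} |w_j(\theta)\rangle$$

and hence

$$\langle v_j^{(m)} | v_j \rangle = -i \frac{\partial\alpha_j}{\partial\theta^m} + \langle w_j^{(m)} | w_j \rangle.$$

This is zero if and only if

$$i \frac{\partial\alpha_j}{\partial\theta^m} = \left\langle \frac{\partial w_j}{\partial\theta^m} \Big| w_j \right\rangle \in i\mathbb{R}, \quad \forall j, m.$$

This is solvable for $\alpha_1, \ldots, \alpha_d$ if and only if

$$\frac{\partial^2 \alpha_j}{\partial\theta^k \partial\theta^l} = \frac{\partial^2 \alpha_j}{\partial\theta^l \partial\theta^k}, \quad \forall j, k, l.$$

This is equivalent to

$$\frac{\partial}{\partial\theta^k} \left\langle \frac{\partial w_j}{\partial\theta^l} \Big| w_j \right\rangle = \frac{\partial}{\partial\theta^l} \left\langle \frac{\partial w_j}{\partial\theta^k} \Big| w_j \right\rangle, \quad \forall j, k, l,$$

which is equivalent to

$$\left\langle \frac{\partial^2 w_j}{\partial\theta^k \partial\theta^l} \Big| w_j \right\rangle + \left\langle \frac{\partial w_j}{\partial\theta^l} \Big| \frac{\partial w_j}{\partial\theta^k} \right\rangle = \left\langle \frac{\partial^2 w_j}{\partial\theta^l \partial\theta^k} \Big| w_j \right\rangle + \left\langle \frac{\partial w_j}{\partial\theta^k} \Big| \frac{\partial w_j}{\partial\theta^l} \right\rangle.$$

Since $|w_j\rangle$ is assumed to be continuously differentiable,

$$\left\langle \frac{\partial^2 w_j}{\partial\theta^k \partial\theta^l} \Big| w_j \right\rangle = \left\langle \frac{\partial^2 w_j}{\partial\theta^l \partial\theta^k} \Big| w_j \right\rangle, \quad \forall j, k, l,$$

and hence it is required that

$$\left\langle \frac{\partial w_j}{\partial\theta^l} \Big| \frac{\partial w_j}{\partial\theta^k} \right\rangle = \left\langle \frac{\partial w_j}{\partial\theta^k} \Big| \frac{\partial w_j}{\partial\theta^l} \right\rangle, \quad \forall j, k, l.$$

This is satisfied if and only if

$$\left\langle \frac{\partial w_j}{\partial\theta^l} \Big| \frac{\partial w_j}{\partial\theta^k} \right\rangle \in \mathbb{R}, \quad \forall j, k, l, \tag{3.47}$$

which, in general, does not hold. Hence, for multi-parameter families of states, $C_L$ is not generally the minimum among $C_\Upsilon$.

Example 3.3 For the family of states given in Example 3.1

$$\left\langle \frac{\partial w_1}{\partial\theta} \Big| \frac{\partial w_1}{\partial\phi} \right\rangle = \frac{i}{2} \sin(\theta/2) \cos(\theta/2),$$

$$\left\langle \frac{\partial w_2}{\partial\theta} \Big| \frac{\partial w_2}{\partial\phi} \right\rangle = -\frac{i}{2} \sin(\theta/2) \cos(\theta/2).$$

Since (3.47) is not satisfied, $C_L$ is not the minimum among $C_\Upsilon$ for this family of states.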
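
The two inner products in Example 3.3 can be checked numerically. The sketch below is illustrative only: it assumes the qubit eigenvectors $|w_1\rangle = (\cos(\theta/2)e^{-i\phi/2}, \sin(\theta/2)e^{i\phi/2})^T$ and $|w_2\rangle = (\sin(\theta/2)e^{-i\phi/2}, -\cos(\theta/2)e^{i\phi/2})^T$ of Example 3.1, and approximates the partial derivatives by central finite differences. The function names and step size `h` are hypothetical.

```python
import cmath
import math


def w1(theta, phi):
    """First eigenvector of the qubit family of Example 3.1."""
    return [math.cos(theta / 2) * cmath.exp(-0.5j * phi),
            math.sin(theta / 2) * cmath.exp(0.5j * phi)]


def w2(theta, phi):
    """Second eigenvector of the qubit family of Example 3.1."""
    return [math.sin(theta / 2) * cmath.exp(-0.5j * phi),
            -math.cos(theta / 2) * cmath.exp(0.5j * phi)]


def partial(vec_fn, theta, phi, wrt, h=1e-6):
    """Central finite-difference partial derivative with respect to theta or phi."""
    if wrt == "theta":
        plus, minus = vec_fn(theta + h, phi), vec_fn(theta - h, phi)
    else:
        plus, minus = vec_fn(theta, phi + h), vec_fn(theta, phi - h)
    return [(a - b) / (2 * h) for a, b in zip(plus, minus)]


def braket(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def mixed_overlap(vec_fn, theta, phi):
    """<d w / d theta | d w / d phi> for the given eigenvector."""
    return braket(partial(vec_fn, theta, phi, "theta"),
                  partial(vec_fn, theta, phi, "phi"))


if __name__ == "__main__":
    theta, phi = 1.0, 0.4
    print(mixed_overlap(w1, theta, phi), mixed_overlap(w2, theta, phi))
    print(0.5j * math.sin(theta / 2) * math.cos(theta / 2))
```

Both overlaps are purely imaginary and non-zero for $0 < \theta < \pi$, so no choice of phases $\alpha_j(\theta,\phi)$ can remove the diagonal term for this family.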



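3.6 Relationship between $C_L$ and SLD information of mixtures

For a general family of states $\rho_\theta = \sum_i p_i(\theta)\, |w_i(\theta)\rangle\langle w_i(\theta)|$, $C_L$ was defined in (3.35) as the matrix with entries

$$(C_L)_{kl} = (C_\Upsilon)_{kl} - 4 \Re \sum_i p_i \langle w_i^{(k)}|w_i\rangle \langle w_i|w_i^{(l)}\rangle$$

$$= \sum_i \frac{1}{p_i} \frac{\partial p_i}{\partial\theta^k} \frac{\partial p_i}{\partial\theta^l} + 4 \Re \sum_i p_i \left[ \langle w_i^{(k)}|w_i^{(l)}\rangle - \langle w_i^{(k)}|w_i\rangle \langle w_i|w_i^{(l)}\rangle \right] \tag{3.48}$$

by (3.30). It is not difficult to show that the SLD quantum information for $\rho_i(\theta) = |w_i(\theta)\rangle\langle w_i(\theta)|$ is the matrix with entries

$$(H(\rho_i))_{kl} = 4 \Re \left[ \langle w_i^{(k)}|w_i^{(l)}\rangle - \langle w_i^{(k)}|w_i\rangle \langle w_i|w_i^{(l)}\rangle \right]. \tag{3.49}$$

Thus,

$$C_L(\rho_\theta) = F_\theta(p) + \sum_i p_i H(\rho_i), \tag{3.50}$$

where $F_\theta(p)$ is the Fisher information matrix for $p = (p_1, \ldots, p_d)$, which has entries

$$(F_\theta(p))_{kl} = \sum_i \frac{1}{p_i} \frac{\partial p_i}{\partial\theta^k} \frac{\partial p_i}{\partial\theta^l}. \tag{3.51}$$

The result (3.50) states that the $C_L$ quantum information is equal to the classical Fisher information of the probability distribution $\{p_1, \ldots, p_d\}$ plus a weighted sum of the SLD quantum informations of the pure states $\rho_i(\theta)$ of which the state $\rho_\theta$ is a convex mixture. From (3.48) it can be seen that for pure states, for which there is only one non-zero $p_i$, $C_L = H$, and so

$$C_L\Big( \sum_i p_i \rho_i \Big) = F_\theta(p) + \sum_i p_i C_L(\rho_i). \tag{3.52}$$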
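
The decomposition of $C_L$ into a classical part and a weighted sum of pure-state SLD informations can be illustrated on a qubit. The sketch below is a hedged example, not the thesis's own calculation: it uses the eigenvectors of Example 3.1 with $\phi$ as the only parameter and fixed eigenvalues $(1 \pm r)/2$, so the classical Fisher term vanishes; it assumes $C_L = 4\sum_{j<k}(p_j+p_k)|\langle w'_j|w_k\rangle|^2$ and, for each pure state, $H(\rho_i) = 4(\langle w'_i|w'_i\rangle - |\langle w'_i|w_i\rangle|^2)$. Function names and the step size `h` are hypothetical.

```python
import cmath
import math


def eigvecs(theta, phi):
    """Eigenvectors |w_1>, |w_2> of the qubit family, as lists of complex numbers."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    em, ep = cmath.exp(-0.5j * phi), cmath.exp(0.5j * phi)
    return [[c * em, s * ep], [s * em, -c * ep]]


def braket(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def d_phi(theta, phi, h=1e-6):
    """Central finite-difference derivatives of both eigenvectors w.r.t. phi."""
    plus, minus = eigvecs(theta, phi + h), eigvecs(theta, phi - h)
    return [[(a - b) / (2 * h) for a, b in zip(vp, vm)]
            for vp, vm in zip(plus, minus)]


def c_l(r, theta, phi):
    """C_L for parameter phi: 4 (p_1 + p_2) |<w_1'|w_2>|^2 (eigenvalues fixed)."""
    p = [(1 + r) / 2, (1 - r) / 2]
    w, dw = eigvecs(theta, phi), d_phi(theta, phi)
    return 4 * (p[0] + p[1]) * abs(braket(dw[0], w[1])) ** 2


def mixture_of_pure_sld(r, theta, phi):
    """sum_i p_i H(rho_i) with H(rho_i) = 4 (<w_i'|w_i'> - |<w_i'|w_i>|^2)."""
    p = [(1 + r) / 2, (1 - r) / 2]
    w, dw = eigvecs(theta, phi), d_phi(theta, phi)
    return sum(p_i * 4 * (braket(dv, dv).real - abs(braket(dv, v)) ** 2)
               for p_i, v, dv in zip(p, w, dw))


if __name__ == "__main__":
    r, theta, phi = 0.5, 1.0, 0.3
    print(c_l(r, theta, phi), mixture_of_pure_sld(r, theta, phi),
          math.sin(theta) ** 2)
```

For this family both sides equal $\sin^2\theta$, independently of $r$, which is consistent with the classical term being zero when the eigenvalues do not depend on the parameter.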



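Note that (3.50) and (3.52) are analogous to (7.4) of Amari (1982): Given random variables $X$ and $Y$ depending on $\theta$ with

$$f(x,y;\theta) = g(x;\theta)\, h(y|x;\theta), \tag{3.53}$$

$$F_\theta^{X,Y} = F_\theta^X + E_X[F_\theta^{Y|x}], \tag{3.54}$$

where

$$F_\theta^{X,Y} = \int\!\!\int f(x,y;\theta) \left( \frac{\partial \log f(x,y;\theta)}{\partial\theta} \right)^2 dx\, dy,$$

$$F_\theta^X = \int g(x;\theta) \left( \frac{\partial \log g(x;\theta)}{\partial\theta} \right)^2 dx,$$

$$E_X[F_\theta^{Y|x}] = \int g(x;\theta) \int h(y|x;\theta) \left( \frac{\partial \log h(y|x;\theta)}{\partial\theta} \right)^2 dy\, dx.$$

Chapter 4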

Simultaneous estimation of 
several commuting quantum 
unitary channels 



4.1 Introduction 

The situation in which there are n non-identical commuting channels which 
are 'dependent' (having the same parameter but different forms) will be con- 
sidered. This chapter introduces the idea of estimation of different commut- 
ing unitary channels simultaneously, as opposed to estimating them sepa- 
rately. Using the SLD quantum information as a measure of performance, it 
will be shown that this can give considerable improvement over estimating 
the channels individually. 



4.1.1 Estimation of unitary channels 

Estimation of an unknown or partially unknown unitary channel has received a lot of attention recently, see Rudolph and Grover (2003), Ji et al. (2008), Acín et al. (2001), Bagan et al. (2004a,b), Ballester (2004a,b), de Martini et al. (2003), Fujiwara (2002), Hayashi (2006a). Almost every quantum information protocol assumes perfect knowledge of a quantum channel. In practice, knowledge will be imperfect; hence estimation of quantum channels has to precede most other quantum information schemes, and its optimization is of fundamental importance.

It will be assumed that the unitary channel comes from a parametric family of channels. When estimating a parameter $\theta$ in a one-parameter model, the SLD quantum information $H_\theta$ will be used as a measure of performance. When $H_\theta$ is attainable, i.e. there exists an $M$ such that $F^M_\theta = H_\theta$, the following result is of importance: As the number of observations $N \to \infty$, using the POVM $M$ and an unbiased maximum likelihood estimator,

$$N E[(\hat\theta - \theta)^2] \to \frac{1}{H_\theta} \tag{4.1}$$

(Van der Vaart, 1998, p. 63). When $\dim\theta > 1$, the performance of estimation will be quantified using the trace of the SLD quantum information ($\mathrm{tr}\{H_\theta\}$). When $\dim\theta > 1$, the SLD quantum informations for different parametric families of states may be incomparable. That is, given two families of states $\rho_\theta^{(1)}$ and $\rho_\theta^{(2)}$, with SLD quantum informations $H_\theta^{(1)}$ and $H_\theta^{(2)}$, it may be that $H_\theta^{(1)} \not\le H_\theta^{(2)}$ and $H_\theta^{(2)} \not\le H_\theta^{(1)}$. The quantity $\mathrm{tr}\{H_\theta\}$ is useful since (Ballester 2004a)

(i) it treats the parameters $\theta^1, \ldots, \theta^p$ with equal importance,
(ii) if $\mathrm{tr}\{H_\theta^{(1)}\} > \mathrm{tr}\{H_\theta^{(2)}\}$ then $H_\theta^{(1)} \not\le H_\theta^{(2)}$.

The output state will be measured using POVMs which satisfy (1.69) (possibly using an adaptive measurement), and an estimate of $\theta$ and hence $U_\theta$ will be obtained using the maximum likelihood estimator.



Previous work in estimation (see Section 1.11.1) has looked at the case where there are $n$ copies of some $U_\theta$. In this chapter a more general problem is considered: given $n$ channels which are not identical, is it better to estimate each of them individually or is it possible to improve on this by using the channels in parallel, as in the case of $n$ identical channels? In practice it may be more common to have $n$ channels which are different (but functionally dependent) than $n$ channels which are identical.

In this chapter the performance of estimation will be considered as a 
function of N, the number of times each of the n channels is used. It will be 
assumed that each channel can be used only once on each input state. 



4.2 Simplifying Matsumoto's equality condi- 
tion 



The following result will simplify later calculations. 



It was mentioned in Theorem 1.3 that for pure state models pg = \ipg){ijjg 



there exists a POVM and estimator such that equality holds in the quantum 
Cramer- Rao inequality, ( 1 .79 ) , at 9 — 6q if and only if 



%l 3 {9)\l k {9)) =0, Wj,k, 



(4.2) 
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where \lj{6)) = \ 3 e \ipe) (Matsumoto 



1997 



Fujiwara 



2002 



Matsumoto 



2002). 



An equivalent condition which is simpler to check, and will be used in this 
chapter, is given in the following lemma. 

Lemma 4.1 For families of pure states pg = \ipg) (ipe\, equality holds in the 
quantum Cramer-Rao inequality, (1.79), at 9 = 6 if and only if 

Z(^ ] \4 k) ) = a Vj,fc, (4.3) 

where = d\ip e }/d9 j . 

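Proof. For pure states, equality holds in (1.79) at $\theta = \theta_0$ if and only if (4.2) is satisfied. Now $|l_j(\theta)\rangle = \lambda_\theta^j|\psi_\theta\rangle$ is independent of the choice of $\lambda_\theta^j$ (Fujiwara 2002, Appendix A, before (7)). A possible choice is

$$\lambda_\theta^l = 2\,\partial\rho_\theta/\partial\theta^l = 2\left(|\psi_\theta^{(l)}\rangle\langle\psi_\theta| + |\psi_\theta\rangle\langle\psi_\theta^{(l)}|\right).$$

A little algebra gives

$$\langle l_j(\theta)|l_k(\theta)\rangle = 4\left(\langle\psi_\theta^{(j)}|\psi_\theta^{(k)}\rangle + \langle\psi_\theta|\psi_\theta^{(j)}\rangle\langle\psi_\theta|\psi_\theta^{(k)}\rangle\right).$$

The second term is always real, since $\langle\psi_\theta|\psi_\theta^{(l)}\rangle$ is purely imaginary for all $l$
(see below (2.12)). Thus condition (4.2) is equivalent to condition (4.3).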

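Remark 4.1 Although $\langle\psi_\theta^{(j)}|\psi_\theta^{(k)}\rangle$ depends on the choice of phase of $|\psi_\theta\rangle$,
$\mathrm{Im}\,\langle\psi_\theta^{(j)}|\psi_\theta^{(k)}\rangle$ does not.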

Lemma 4.2 If $|x_1\rangle, \dots, |x_n\rangle \in \mathbb{C}^d$ are such that
(i) $|x_1\rangle, \dots, |x_n\rangle$ are $\mathbb{R}$-linearly independent,
(ii) $\langle x_j|x_k\rangle \in \mathbb{R}$ for all $j,k = 1, \dots, n$,
then $n \le d$.

Proof. Suppose that there exist $\alpha_1, \dots, \alpha_n \in \mathbb{C}$ such that

$$\sum_{j=1}^n \alpha_j|x_j\rangle = 0. \tag{4.4}$$

Putting $\alpha_j = a_j + ib_j$, where $a_j, b_j \in \mathbb{R}$ for $j = 1, \dots, n$, gives

$$\sum_{j=1}^n (a_j + ib_j)|x_j\rangle = 0, \tag{4.5}$$

and so

$$\sum_{j=1}^n (a_j + ib_j)\langle x_k|x_j\rangle = 0, \quad \text{for all } k. \tag{4.6}$$

From condition (ii), the real part of (4.6) is

$$\sum_{j=1}^n a_j\langle x_k|x_j\rangle = 0, \quad \text{for all } k.$$

Thus

$$\sum_{j,k=1}^n a_ja_k\langle x_k|x_j\rangle = 0,$$

and so

$$\sum_{j=1}^n a_j|x_j\rangle = 0.$$

Since by (i) $|x_1\rangle, \dots, |x_n\rangle$ are $\mathbb{R}$-linearly independent, $a_j = 0$ for $j = 1, \dots, n$. Similarly (4.6) gives $b_j = 0$ for $j = 1, \dots, n$. Thus $\alpha_j = 0$ for $j = 1, \dots, n$.

Consequently, $|x_1\rangle, \dots, |x_n\rangle$ are $\mathbb{C}$-linearly independent. Therefore, if $|x_1\rangle, \dots, |x_n\rangle \in \mathbb{C}^d$ satisfy (i) and (ii), then $n \le d$.

Theorem 4.1 For a $d$-dimensional non-degenerate family of pure states $\rho_\theta = |\psi_\theta\rangle\langle\psi_\theta|$, $\theta = (\theta^1, \dots, \theta^p)$, $H_\theta$ is attainable only if $p \le d - 1$.

Proof. The vectors $\{|l_j(\theta)\rangle\}$, where $|l_j(\theta)\rangle = \lambda_\theta^j|\psi_\theta\rangle$, are $\mathbb{R}$-linearly independent (due to the nondegeneracy of the parameterization $\theta \mapsto \rho_\theta$) (Fujiwara 2002, Appendix A). Since $\langle l_j(\theta)|\psi_\theta\rangle = \mathrm{tr}\{\rho_\theta\lambda_\theta^j\} = 0$ for all $j$, the vectors $\{|\psi_\theta\rangle, |l_1(\theta)\rangle, \dots, |l_p(\theta)\rangle\}$ are also $\mathbb{R}$-linearly independent. From (4.2) it is seen that $H_\theta$ is attainable if and only if the set of vectors $\{|\psi_\theta\rangle, |l_1(\theta)\rangle, \dots, |l_p(\theta)\rangle\}$ satisfies conditions (i) and (ii) in Lemma 4.2. It follows from Lemma 4.2 that $H_\theta$ is attainable only if $p \le d - 1$.

Remark 4.2 As any unitary channel can be specified by $d^2 - 1$ parameters, Theorem 4.1 shows the importance of enlarging the Hilbert space to estimate a completely unknown $U \in SU(d)$, i.e. letting $I_d \otimes U$ act on a state $|\varphi\rangle \in \mathbb{C}^{d^2}$. In this case it is possible to have a maximum of $d^2 - 1$ parameters such that $H_\theta$ is attainable.

For the channels considered in this chapter an extension of the form $I_d \otimes U_\theta$ does not increase the maximum attainable Fisher information.
4.3 A 2-dimensional family of non-identical channels

Consider the following set of 2-dimensional channels, which are all functions
of the parameter $\theta$,

$$U_\theta^1 = \begin{pmatrix} 1 & 0 \\ 0 & e^{if_1(\theta)} \end{pmatrix}, \;\dots,\; U_\theta^n = \begin{pmatrix} 1 & 0 \\ 0 & e^{if_n(\theta)} \end{pmatrix}, \tag{4.7}$$

where $0 < \theta < q$, for some $q$, and $f_j : \mathbb{R} \to \mathbb{R}$. The following conditions are
imposed on the functions $f_j$:

(a) $df_j(\theta)/d\theta > 0$,

(b) $0 < \sum_{j=1}^n f_j(\theta) < \pi$,

for all $j$ and $\theta$.

Remark 4.3 Throughout this chapter similar restrictions will be given on
the unitary matrices to be estimated. Condition (a) means that as $\theta$ is increased
the angle through which states are rotated is also increased, though the
amount by which the phase increases varies from unitary to unitary; condition
(b) can be thought of as having some prior information about the phases to
be estimated, possibly through a knowledge of the experimental arrangements.

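The SLD quantum informations of the schemes

(i) letting each of the $n$ channels act on identical copies of $|\psi_x\rangle = \frac{1}{\sqrt2}(|0\rangle + |1\rangle)$, i.e.

$$|\psi_x\rangle \mapsto U_\theta^j|\psi_x\rangle,$$

(ii) arranging all $n$ of the channels in parallel and using the entangled input
state $|\psi\rangle = \frac{1}{\sqrt2}(|00\cdots0\rangle + |11\cdots1\rangle) \in \mathbb{C}^{2^n}$, i.e.

$$|\psi\rangle \mapsto (U_\theta^1 \otimes \cdots \otimes U_\theta^n)|\psi\rangle, \tag{4.8}$$

will be compared. If $U_\theta^j$ acts on the state $|\psi_x\rangle$, the output state is $\frac{1}{\sqrt2}(|0\rangle + e^{if_j(\theta)}|1\rangle)$. This gives $H_\theta^j = (df_j(\theta)/d\theta)^2$, which is attainable by measuring
in $x$, i.e. using the POVM $M_x = \{M_0 = |\psi_x\rangle\langle\psi_x|,\ M_1 = I - M_0\}$. Thus for the $n$
channels

$$H_\theta^{(i)} = \sum_{j=1}^n \left(\frac{df_j(\theta)}{d\theta}\right)^2 \tag{4.9}$$

and is attainable. An estimate $\hat\theta^{(i)}$ is obtained using the maximum likelihood
estimator.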
Now the $n$-partite input state will be considered. The output state is
$\frac{1}{\sqrt2}\left(|00\cdots0\rangle + e^{i\sum_{j=1}^n f_j(\theta)}|11\cdots1\rangle\right)$. Computation gives

$$H_\theta^{(ii)} = \left(\sum_{j=1}^n \frac{df_j(\theta)}{d\theta}\right)^2, \tag{4.10}$$

which is attainable using the POVM $M = \{M_0 = |\psi\rangle\langle\psi|,\ M_1 = I - M_0\}$. Because
of conditions (a) and (b), $\theta$ can be identified. An estimate $\hat\theta^{(ii)}$ is obtained
using the maximum likelihood estimator.



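The SLD quantum informations (4.9) and (4.10) may look similar, but
they are not: in (4.9) each derivative is squared before summing, whereas in (4.10) the sum of the derivatives is squared. Since condition (a) makes every derivative $df_j/d\theta$ positive, all the cross terms in (4.10) are positive, and so (4.10) is considerably larger than
(4.9). For example, in the case when $f_j(\theta) = \theta$ for all $j$, the SLD quantum
informations from $N$ uses of each channel are $Nn$ and $Nn^2$, respectively.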

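A consequence of this is that the asymptotic limit of the mean square
error is considerably smaller using approach (ii). The asymptotic limits of
the mean square errors for approaches (i) and (ii) are, respectively,

$$N\,E\!\left[(\hat\theta^{(i)} - \theta)^2\right] \to \frac{1}{\sum_{j=1}^n \left(\frac{df_j(\theta)}{d\theta}\right)^2}, \tag{4.11}$$

$$N\,E\!\left[(\hat\theta^{(ii)} - \theta)^2\right] \to \frac{1}{\left(\sum_{j=1}^n \frac{df_j(\theta)}{d\theta}\right)^2}. \tag{4.12}$$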
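
As a rough numerical illustration of the two limits, the following Python sketch evaluates them for hypothetical linear phase functions $f_j(\theta) = c_j\theta$. The slopes $c_j$ and the function names are assumptions made for this example only; they do not appear in the text.

```python
def limit_separate(slopes):
    """Limit of N * MSE when each channel is used on its own copy of the input:
    1 / sum_j (df_j/dtheta)^2."""
    return 1.0 / sum(c * c for c in slopes)


def limit_parallel(slopes):
    """Limit of N * MSE when the channels are used together:
    1 / (sum_j df_j/dtheta)^2."""
    return 1.0 / sum(slopes) ** 2


# Hypothetical case f_j(theta) = theta for n = 4 channels.
slopes = [1.0, 1.0, 1.0, 1.0]
print(limit_separate(slopes))   # 1/n   = 0.25
print(limit_parallel(slopes))   # 1/n^2 = 0.0625
```

For positive slopes the second limit is never larger than the first, which is the content of the comparison between approaches (i) and (ii).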



4.3.1 Sequential method 

Here it will be shown that, without using entanglement, it is possible to
obtain the same SLD quantum information for the set of non-identical channels
(4.7) as was obtained in approach (ii). A third scheme for estimating
the set of channels (4.7) will be introduced, which will be referred to as the
sequential scheme. The sequential scheme makes no use of entanglement.

(iii) The channels (4.7) are each used once on the same separable input
state $|\psi_x\rangle$, i.e.

$$|\psi_x\rangle \mapsto U_\theta^n \cdots U_\theta^2U_\theta^1|\psi_x\rangle = \frac{1}{\sqrt2}\left(|0\rangle + e^{i\sum_{j=1}^n f_j(\theta)}|1\rangle\right). \tag{4.13}$$

Calculation gives

$$H_\theta^{(iii)} = \left(\sum_{j=1}^n \frac{df_j(\theta)}{d\theta}\right)^2 \tag{4.14}$$

and is attainable by measuring in $x$. An estimate $\hat\theta^{(iii)}$ is obtained using the
maximum likelihood estimator. The SLD quantum information obtained in
approach (iii) is equal to that obtained in approach (ii), and thus the estimator has the
same asymptotic limit on the mean square error, (4.12).
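
The comparison of the three schemes can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the thesis: it takes hypothetical phase functions $f_j(\theta) = c_j\theta$, and evaluates the pure-state SLD quantum information $4(\langle\psi'|\psi'\rangle - |\langle\psi|\psi'\rangle|^2)$ with a central finite difference. For the entangled scheme only the two non-zero amplitudes are stored, so schemes (ii) and (iii) share one state function.

```python
import cmath
import math


def qfi(state, theta, h=1e-6):
    """SLD quantum information 4(<psi'|psi'> - |<psi|psi'>|^2) of the pure-state
    family theta -> state(theta), using a central finite difference."""
    psi = state(theta)
    dpsi = [(a - b) / (2 * h) for a, b in zip(state(theta + h), state(theta - h))]
    norm = sum(abs(d) ** 2 for d in dpsi)
    overlap = sum(p.conjugate() * d for p, d in zip(psi, dpsi))
    return 4 * (norm - abs(overlap) ** 2)


# Hypothetical phase functions f_j(theta) = c_j * theta (positive slopes).
slopes = [0.2, 0.5, 0.3]


def single(c):
    # Scheme (i): one channel with slope c acting on (|0> + |1>)/sqrt(2).
    return lambda t: [1 / math.sqrt(2), cmath.exp(1j * c * t) / math.sqrt(2)]


def combined(t):
    # Schemes (ii) and (iii): the relative phase is sum_j f_j(theta).
    return [1 / math.sqrt(2), cmath.exp(1j * sum(slopes) * t) / math.sqrt(2)]


theta = 0.4
h_separate = sum(qfi(single(c), theta) for c in slopes)  # sum of squared slopes
h_combined = qfi(combined, theta)                        # square of summed slopes
```

With these slopes `h_separate` is close to 0.04 + 0.25 + 0.09 and `h_combined` is close to 1, up to finite-difference error.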



4.4 A more general family of one-parameter channels

Often physicists are interested in unitary channels parameterised as $U_\theta = \exp(i\theta H)$, where $H$ is an observable related to the energy in a system, known
as the Hamiltonian. This seemingly simple channel has many examples in
interferometry and measurement of small forces. (For more on channels of
this type see Giovannetti et al. 2006, and the references therein.) Consider
$n$ $d$-dimensional unitary channels parameterised as

$$U_\theta^j = \exp(if_j(\theta)H), \quad 1 \le j \le n, \tag{4.15}$$

where $0 < \theta < q$, for some $q$, and $f_j : \mathbb{R} \to \mathbb{R}$ for all $j$. The following conditions
are imposed on the functions $f_j$:

(a) $df_j(\theta)/d\theta > 0$,

(b) $0 < \sum_{j=1}^n f_j(\theta) < \pi$,

for all $j$ and $\theta$. The problem of finding the optimal input state will not be
considered. The SLD quantum informations for

(i) letting each of the $n$ channels act on identical copies of some $|\phi_0\rangle$, i.e.

$$|\phi_0\rangle \mapsto U_\theta^j|\phi_0\rangle,$$

(ii) letting each of the $n$ channels act on the same separable state $|\phi_0\rangle$, i.e.

$$|\phi_0\rangle \mapsto U_\theta^n \cdots U_\theta^2U_\theta^1|\phi_0\rangle$$
will be compared. The SLD quantum informations for (i) and (ii) are, respectively,

$$H_\theta^{(i)} = 4\sum_{j=1}^n \left(\frac{df_j(\theta)}{d\theta}\right)^2\left[\langle\phi_0|H^2|\phi_0\rangle - \langle\phi_0|H|\phi_0\rangle^2\right], \tag{4.16}$$

$$H_\theta^{(ii)} = 4\left(\sum_{j=1}^n \frac{df_j(\theta)}{d\theta}\right)^2\left[\langle\phi_0|H^2|\phi_0\rangle - \langle\phi_0|H|\phi_0\rangle^2\right]. \tag{4.17}$$

Because of condition (a) the SLD quantum information of (ii), given by
(4.17), is considerably larger than that of (i), given by (4.16). These results
hold for all choices of input state $|\phi_0\rangle$.
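
As a quick consistency check (a worked special case added for illustration, not taken from the text), recall that for a channel $\exp(ig(\theta)H)$ acting on a pure state $|\phi_0\rangle$ the SLD quantum information is $4\,g'(\theta)^2$ times the variance of $H$ in $|\phi_0\rangle$. Taking $d = 2$, $H = |1\rangle\langle1|$ and $|\phi_0\rangle = (|0\rangle + |1\rangle)/\sqrt2$ recovers the single-qubit channels of Section 4.3:

```latex
\langle\phi_0|H^2|\phi_0\rangle - \langle\phi_0|H|\phi_0\rangle^2
  = \tfrac12 - \tfrac14 = \tfrac14 ,
\qquad
4\cdot\tfrac14\sum_{j=1}^n\Big(\frac{df_j}{d\theta}\Big)^2
  = \sum_{j=1}^n\Big(\frac{df_j}{d\theta}\Big)^2 ,
\qquad
4\cdot\tfrac14\Big(\sum_{j=1}^n\frac{df_j}{d\theta}\Big)^2
  = \Big(\sum_{j=1}^n\frac{df_j}{d\theta}\Big)^2 .
```

The variance factor is common to both schemes, so the ratio of the two informations depends only on the functions $f_j$ and not on the input state.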



4.5 A d-dimensional family of non-identical channels

The situation of having $n$ 'dependent' $d$-dimensional commuting channels
will be considered. These will be parameterised in a similar way to that
used by Ballester (2004a). Ballester (2004a) looked at commuting unitary
channels. Any commuting unitary channel can be specified using $d - 1$ parameters,
i.e. by a parameter $\theta = (\theta_1, \dots, \theta_{d-1})$. Given a set of $d \times d$ matrices
$t_k$, $k = 1, \dots, d - 1$, satisfying

(i) $t_k = t_k^\dagger$,

(ii) $\mathrm{tr}\{t_k\} = 0$,

(iii) $\mathrm{tr}\{t_kt_l\} = \delta_{kl}$,

(iv) $t_kt_l = t_lt_k$,

Ballester parameterised the set of commuting unitary channels as

$$U_\theta = \exp\left(i\sum_{k=1}^{d-1}\theta_kt_k\right). \tag{4.18}$$

Since, from (iv), $t_k$ and $t_l$ commute, they share a basis $\{|w_k\rangle\}$, which is
assumed to be known. Consequently, any $t_m$ can be written as

$$t_m = \sum_{i=1}^d c_{mi}|w_i\rangle\langle w_i|. \tag{4.19}$$

From condition (i) it follows that $c_{mi} \in \mathbb{R}$ for all $m, i$. Conditions (ii) and
(iii) give

$$\sum_{i=1}^d c_{mi} = 0, \tag{4.20}$$

$$\sum_{i=1}^d c_{mi}c_{li} = \delta_{ml}. \tag{4.21}$$

Ballester showed that there is no advantage in extending $U_\theta$ and using a
maximally entangled input state. The maximum value of $\mathrm{tr}\{H_\theta\}$ can be
attained using the separable state

$$|\psi_{sep}\rangle = \frac{1}{\sqrt d}\sum_{k=1}^d |w_k\rangle. \tag{4.22}$$

Consider the set of channels

$$U_\theta^j = \exp\left(i\sum_{k=1}^{d-1} f_j(\theta_k)t_k\right), \quad 1 \le j \le n, \tag{4.23}$$

where < 9 < q, for some q, fj : R — > R and fj(9 ) = for all j. All 
$n$ channels depend on the parameter $\theta = (\theta_1, \dots, \theta_{d-1})$, and each channel
depends on every component of $\theta$. The following conditions are imposed on
the functions $f_j$:

(b) o<E, /,W<^ 

for all $j$, $k$ and $\theta$. The traces of the SLD quantum information for

(i) letting each of the $n$ channels act on identical copies of $|\psi_{sep}\rangle$ given in
(4.22), i.e.

$$|\psi_{sep}\rangle \mapsto U_\theta^j|\psi_{sep}\rangle,$$

(ii) letting each of the $n$ channels act on the same separable state $|\psi_{sep}\rangle$,
i.e.

$$|\psi_{sep}\rangle \mapsto U_\theta^n \cdots U_\theta^2U_\theta^1|\psi_{sep}\rangle$$

will be compared.
Proposition 4.1 The traces of the SLD quantum informations for (i) and
(ii) are, respectively,

$$\mathrm{tr}\{H_\theta^{(i)}\} = \frac{4}{d}\sum_{l=1}^{d-1}\sum_{j=1}^n \left(\frac{df_j(\theta_l)}{d\theta_l}\right)^2, \tag{4.24}$$

$$\mathrm{tr}\{H_\theta^{(ii)}\} = \frac{4}{d}\sum_{l=1}^{d-1}\left(\sum_{j=1}^n \frac{df_j(\theta_l)}{d\theta_l}\right)^2. \tag{4.25}$$

From condition (a), the trace of the SLD quantum information of (ii), given
by (4.25), is considerably larger than that of (i), given by (4.24).



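Proof. A proof will be given for (4.25); the proof for (4.24) is very similar.
The $j$th unitary channel will be denoted by $U_\theta^j$. As the $U_\theta^j$ commute,

$$\prod_{j=1}^n U_\theta^j = \exp\left\{i\sum_{k=1}^{d-1} g_k(\theta_k)t_k\right\}, \quad g_k(\theta_k) = \sum_{j=1}^n f_j(\theta_k). \tag{4.26}$$

Using each of the $n$ channels on the single input state (4.22) gives the output
state

$$|\psi_\theta\rangle = \left(\prod_{j=1}^n U_\theta^j\right)|\psi_{sep}\rangle = \exp\left\{i\sum_{k=1}^{d-1} g_k(\theta_k)t_k\right\}|\psi_{sep}\rangle. \tag{4.27}$$

An arbitrary diagonal element of $H_\theta$ is equal to

$$\begin{aligned}
(H_\theta)_{mm} &= 4\left[\langle\psi_\theta^{(m)}|\psi_\theta^{(m)}\rangle - |\langle\psi_\theta^{(m)}|\psi_\theta\rangle|^2\right], \quad |\psi_\theta^{(m)}\rangle = \partial|\psi_\theta\rangle/\partial\theta_m \\
&= 4\left(\frac{dg_m(\theta_m)}{d\theta_m}\right)^2\left[\langle\psi_{sep}|t_m^2|\psi_{sep}\rangle - |\langle\psi_{sep}|t_m|\psi_{sep}\rangle|^2\right] \\
&= 4\left(\frac{dg_m(\theta_m)}{d\theta_m}\right)^2\left[\frac{1}{d}\sum_{k=1}^d c_{mk}^2 - \left(\frac{1}{d}\sum_{k=1}^d c_{mk}\right)^2\right] \quad \text{by (4.19)} \\
&= \frac{4}{d}\left(\frac{dg_m(\theta_m)}{d\theta_m}\right)^2 \quad \text{by (4.20) and (4.21).}
\end{aligned}$$

Thus

$$\mathrm{tr}\{H_\theta\} = \frac{4}{d}\sum_{m=1}^{d-1}\left(\frac{dg_m(\theta_m)}{d\theta_m}\right)^2 = \frac{4}{d}\sum_{m=1}^{d-1}\left(\sum_{j=1}^n \frac{df_j(\theta_m)}{d\theta_m}\right)^2.$$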




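Proposition 4.2 The SLD quantum information (4.25) is attainable.

Proof. The set of output states is given by (4.27). Now

$$\begin{aligned}
\langle\psi_\theta^{(m)}|\psi_\theta^{(n)}\rangle &= \frac{dg_m(\theta_m)}{d\theta_m}\frac{dg_n(\theta_n)}{d\theta_n}\langle\psi_{sep}|t_mt_n|\psi_{sep}\rangle \\
&= \frac{dg_m(\theta_m)}{d\theta_m}\frac{dg_n(\theta_n)}{d\theta_n}\,\frac{1}{d}\sum_{k=1}^d c_{mk}c_{nk} \\
&= \frac{\delta_{mn}}{d}\left(\frac{dg_m(\theta_m)}{d\theta_m}\right)^2,
\end{aligned}$$

which is always real. Thus (4.3) is satisfied, and consequently $H_\theta$ is attainable.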
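
The value of the trace for scheme (ii) can be checked numerically in a small case. The following Python sketch is only an illustration under assumptions that are not in the text: it takes $d = 3$, one particular choice of real, traceless, orthonormal diagonal generators, and hypothetical functions $f_j(t) = c_jt$. It computes the diagonal elements of the SLD quantum information of the output state by finite differences and compares their sum with $(4/d)\sum_l(\sum_j df_j/d\theta_l)^2$.

```python
import cmath
import math

d = 3
# Rows are coefficients c_{mi} of two diagonal generators in the shared basis;
# each row sums to zero and the rows are orthonormal (an assumed example).
C = [
    [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0],
    [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)],
]
slopes = [0.2, 0.5, 0.3]  # hypothetical f_j(t) = slopes[j] * t


def g(t):
    """g(t) = sum_j f_j(t): total phase function after using all n channels."""
    return sum(c * t for c in slopes)


def output_state(theta):
    """Amplitudes of exp(i sum_k g(theta_k) t_k) applied to the equal superposition."""
    return [
        cmath.exp(1j * sum(g(theta[k]) * C[k][i] for k in range(d - 1))) / math.sqrt(d)
        for i in range(d)
    ]


def qfi_diagonal(theta, m, h=1e-6):
    """m-th diagonal element 4(<psi_m|psi_m> - |<psi|psi_m>|^2), by central differences."""
    up, down = list(theta), list(theta)
    up[m] += h
    down[m] -= h
    psi = output_state(theta)
    dpsi = [(a - b) / (2 * h) for a, b in zip(output_state(up), output_state(down))]
    norm = sum(abs(x) ** 2 for x in dpsi)
    overlap = sum(p.conjugate() * x for p, x in zip(psi, dpsi))
    return 4 * (norm - abs(overlap) ** 2)


theta = [0.3, 0.7]
trace = sum(qfi_diagonal(theta, m) for m in range(d - 1))
expected = (4 / d) * (d - 1) * sum(slopes) ** 2
```

Here `trace` and `expected` agree up to finite-difference error; both are close to 8/3 for the slopes chosen above.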
Chapter 5

An iterative phase estimation algorithm



5.1 Introduction 

This chapter considers phase estimation, which is of fundamental importance
to quantum information and quantum computation. Phase estimation is related
to some very important problems such as estimating eigenvalues (Wei
and Nori, 2004, Aspuru-Guzik et al., 2005, Wang et al. 2008, 2009), precision
measurement of length and optical properties, and clock synchronization
(de Burgh and Bartlett 2005). (The work in this chapter has been published
in O'Loan (2010).)

Consider a unitary matrix $U_\theta$ depending on an unknown parameter $\theta$ for
which one of its eigenvectors $|u\rangle$ is completely known; furthermore $U_\theta$ acts on
$|u\rangle$ by $U_\theta|u\rangle = e^{i2\pi\theta}|u\rangle$, where $\theta \in [0, 1)$. The task of phase estimation is to
estimate the eigenvalue $e^{i2\pi\theta}$, and consequently $\theta$, as accurately as possible.
This chapter considers phase estimation of a unitary matrix with known
eigenvectors, which acts on a 2-dimensional Hilbert space. In particular,
unitary matrices of the form

$$U_\theta = \begin{pmatrix} 1 & 0 \\ 0 & e^{i2\pi\theta} \end{pmatrix} \tag{5.1}$$

are considered, where $\theta \in [0, 1)$. The angle $\theta$ will be thought of as a point
on a circle of unit circumference, and confidence intervals for $\theta$ as arcs on a
circle of unit circumference, known as confidence arcs. The distance between
the point $\theta$ and an estimate $\hat\theta$ will be defined as

$$|\hat\theta - \theta|_1 = \min\left((\hat\theta - \theta)_{\mathrm{mod}\,1},\ (\theta - \hat\theta)_{\mathrm{mod}\,1}\right). \tag{5.2}$$
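
The circular distance is simple to compute. A minimal Python sketch follows; the function name is an assumption introduced for illustration.

```python
def circular_distance(theta_hat, theta):
    """Distance between two points on a circle of unit circumference:
    the shorter of the two arcs joining them."""
    return min((theta_hat - theta) % 1.0, (theta - theta_hat) % 1.0)
```

For instance, the points 0.95 and 0.05 are at distance 0.1 (up to floating-point rounding), not 0.9, because the shorter arc passes through 0.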
The performance of phase estimation schemes will be quantified in terms
of the expected fidelity $\langle F(U_{\hat\theta}, U_\theta)\rangle$. The cost function

$$1 - \langle F(U_{\hat\theta}, U_\theta)\rangle = 1 - \left\langle\left|\frac{\mathrm{tr}\{U_{\hat\theta}^\dagger U_\theta\}}{2}\right|^2\right\rangle \tag{5.3}$$

will be used, and its asymptotic scaling analysed as a function of $n$, the
number of times that $U_\theta$ is used.

For a simple phase estimation approach where $U_\theta$ is used once on $n$ identical
copies of some input state (see Section 5.1.1), $1 - \langle F\rangle = O(1/n)$. This
rate at which $1 - \langle F\rangle$ approaches zero is known as the standard quantum
limit (de Burgh and Bartlett, 2005).

However, it has been shown (Hayashi 2006a, Kahn 2007, Imai and Fujiwara
2007) that it is possible to obtain $1 - \langle F\rangle = O(1/n^2)$. This rate at
which $1 - \langle F\rangle$ approaches zero is known as the Heisenberg limit (Giovannetti
et al., 2004), and cannot be beaten (Kahn, 2007). These methods require $n$
copies of $U_\theta$ and entangled states.

It has further been shown that it is possible to achieve the Heisenberg
limit without entanglement, and with only a single copy of $U_\theta$ (see Section
5.1.5). Estimation schemes of this type require a rotation gate capable of
performing arbitrary rotations to perfect precision.

Kitaev (1996) sketched an iterative phase estimation method which requires
only a single copy of $U_\theta$ and basic measurements: no extra rotation
gate is needed. For this method $1 - \langle F\rangle = O((\log n/n)^2)$, which is within a
logarithmic factor of the Heisenberg limit. However, as will be shown in this
chapter, attempts to give a detailed account of such a scheme have been
unsuccessful. This chapter seeks to give a correct detailed phase estimation
scheme similar to that of Kitaev (1996), which requires only a single copy of
$U_\theta$ and basic measurements.

A selection of different phase estimation schemes will now be given. 



5.1.1 Simple approach 

A very simple method of phase estimation is to let $U_\theta$ act on the input state
$|\psi_x\rangle = \frac{1}{\sqrt2}(|0\rangle + |1\rangle)$; the output state is $|\psi_\theta\rangle = \frac{1}{\sqrt2}(|0\rangle + e^{i2\pi\theta}|1\rangle)$.
After measuring in $x$, outcome 0 is observed with probability $p(0;\theta) = (1 + \cos(2\pi\theta))/2$. Performing $N$ measurements gives an estimate $2N_{x=0}/N - 1$ of $\cos(2\pi\theta)$, where $N_{x=0}$ is the number of times outcome 0
is observed. After measuring in $y$, outcome 0 is observed with probability
$p(0;\theta) = (1 + \sin(2\pi\theta))/2$. Performing $N$ measurements gives an estimate
$2N_{y=0}/N - 1$ of $\sin(2\pi\theta)$, where $N_{y=0}$ is the number of times
outcome 0 is observed. From estimates of $\cos(2\pi\theta)$ and $\sin(2\pi\theta)$ an estimate
of $\theta$ can be obtained.
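
The simple approach can be simulated classically by sampling the two outcome distributions. The sketch below is an illustration only: the function name, the seeded random generator, and the use of the two-argument arctangent to combine the cosine and sine estimates are assumptions, since the text does not specify how the final estimate is formed at this point.

```python
import math
import random


def simple_estimate(theta, n_meas, rng):
    """Simulate n_meas measurements in x and n_meas in y, then estimate theta."""
    p_x = (1 + math.cos(2 * math.pi * theta)) / 2   # Pr(outcome 0) measuring in x
    p_y = (1 + math.sin(2 * math.pi * theta)) / 2   # Pr(outcome 0) measuring in y
    n_x0 = sum(rng.random() < p_x for _ in range(n_meas))
    n_y0 = sum(rng.random() < p_y for _ in range(n_meas))
    cos_hat = 2 * n_x0 / n_meas - 1                 # estimate of cos(2 pi theta)
    sin_hat = 2 * n_y0 / n_meas - 1                 # estimate of sin(2 pi theta)
    # math.atan2 takes (sine, cosine) in that order and returns an angle in
    # (-pi, pi], which is mapped back to [0, 1).
    return (math.atan2(sin_hat, cos_hat) / (2 * math.pi)) % 1.0


rng = random.Random(0)
theta_hat = simple_estimate(0.3, 20000, rng)
```

The statistical error of `theta_hat` shrinks like $1/\sqrt{N}$, in line with the standard quantum limit discussed above.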



5.1.2 Kitaev's procedure 



The first $l$-stage iterative phase estimation procedure was given by Kitaev
(1996). (The number of stages $l$ is chosen beforehand, and will be a compromise
between the precision desired and experimental resources and limitations.)
At the $k$th stage of Kitaev's procedure, $U_\theta$ acts $2^{k-1}$ times on a qubit,
which is then measured. The experimenter performs some multiple of $\log(l/\epsilon)$
measurements of $(2^{k-1}\theta)_{\mathrm{mod}\,1}$. This ensures that it is possible to 'localize
each of the numbers $2^{k-1}\theta$ in one of the 8 intervals $[(s-1)/8, (s+1)/8]$ $(s = 0, \dots, 7)$ with error probability $\le \epsilon/l$'. Using this information, an algorithm,
which is not given, produces an estimate $\hat\theta$ satisfying

$$\Pr\left((\hat\theta - 1/2^{l+2},\ \hat\theta + 1/2^{l+2}) \ni \theta\right) \ge 1 - \epsilon. \tag{5.4}$$
5.1.3 The scheme of Rudolph and Grover 



Rudolph and Grover (2003) looked at the problem of transmitting a reference
frame from Alice to Bob, which is linked to estimation of an unknown $U \in SU(2)$, parametrized by three parameters $\alpha, \theta, \phi$. The scheme of Rudolph
and Grover involves estimating the parameters $\alpha, \theta, \phi$ individually using the
following $l$-stage iterative procedure. The parameter $\theta \in [0, 1)$ is thought
of in terms of an infinite binary expansion $\theta = 0.w_1w_2\dots w_l\dots$. At the $k$th
stage a qubit is sent back and forth between Alice and Bob in such a way
that, when Bob finally measures it, he observes outcome 0 with probability
$p_k(0;\theta) = (1 + \cos(2^k\pi\theta))/2$.

This is repeated a minimum of $N = 32\log_2(2l/\epsilon)$ times (Rudolph and
Grover, 2003), which ensures that Bob's estimate $\hat p_k(1;\theta)$ of $p_k(1;\theta)$ satisfies

$$\Pr\left((\hat p_k - 1/4,\ \hat p_k + 1/4) \ni p_k\right) \ge 1 - \epsilon/l. \tag{5.5}$$

It is assumed that if $|\hat p_k - p_k| < 1/4$, then Bob can estimate the $k$th bit of $\theta$
correctly. If this is so, then from (5.5), the probability that Bob estimates the
$k$th bit of $\theta$ correctly is at least $1 - \epsilon/l$, and the probability that he estimates
all of the binary digits of $\theta$ correctly is at least $1 - \epsilon$. After $l$ stages, an
estimate $\hat\theta = 0.\hat w_1\hat w_2\dots\hat w_l$ is obtained, satisfying

$$\Pr\left((\hat\theta - 1/2^l,\ \hat\theta + 1/2^l) \ni \theta\right) \ge 1 - \epsilon. \tag{5.6}$$

A similar scheme is then used to estimate the parameters $\alpha$ and $\phi$. The
method of Rudolph and Grover has been used by de Burgh and Bartlett
(2005) for the problem of clock synchronization.
5.1.4 The procedure of Ji et al.

Ji et al. (2008) highlighted two errors with the method of Rudolph and
Grover:

(i) knowing $|\hat\theta - \theta|_1 < 1/2^m$ does not give the first $m$ bits of the binary
expansion of $\theta$; consider $\theta = 0.49$, $\hat\theta = 0.5$ and $m = 1$,

(ii) the method is problematic (in the sense explained in Section 5.2) for $\theta$
close to 1/2.

Ji et al. gave the following $l$-stage procedure. In the first stage, the experimenter
lets $U_\theta$ act on $|\psi_x\rangle$ and then measures in $x$; outcome 0 is observed
with probability $p(0;\theta) = (1 + \cos(2\pi\theta))/2$. The state $U_\theta|\psi_x\rangle$ is measured $N$
times ($N$ is some multiple of $\log(l/\epsilon)$), which gives an estimate $\hat\theta$ satisfying

$$\Pr\left((\hat\theta - 1/12,\ \hat\theta + 1/12) \ni \theta\right) \ge 1 - \epsilon/l. \tag{5.7}$$

Having obtained an estimate $\hat\theta$,

1) if $\hat\theta \in [0, 5/12)$, define $r_1 = 2$ and $\nu_1 = 0$,

2) if $\hat\theta \in [5/12, 7/12)$, define $r_1 = 3$ and $\nu_1 = 1$,

3) if $\hat\theta \in [7/12, 1)$, define $r_1 = 2$ and $\nu_1 = 1$.

At the $k$th stage the experimenter lets $U_\theta$ act $r_1r_2\cdots r_{k-1}$ times on $|\psi_x\rangle$.
After measuring $U_\theta^{r_1r_2\cdots r_{k-1}}|\psi_x\rangle$ $N$ times, $(r_1r_2\cdots r_{k-1}\theta)_{\mathrm{mod}\,1}$ is estimated and
$r_k$ and $\nu_k$ are obtained in a similar way to $r_1$ and $\nu_1$. After $l$ stages, values
are obtained for $(r_1, \dots, r_l, \nu_1, \dots, \nu_l)$. The final estimate of $\theta$ is

$$\hat\theta = \sum_{i=1}^l \frac{\nu_i}{r_1r_2\cdots r_i}. \tag{5.8}$$



5.1.5 The method of Dobsicek et al.

A popular iterative estimation method is to take $\theta$ to have a binary expansion
of given length $l$ plus some small remainder, that is $\theta = 0.w_1w_2\cdots w_l + \Delta$.
The binary digits $w_1, \dots, w_l$ are estimated one at a time with a single measurement.
This has been done by Childs et al. (2000), Dobsicek et al. (2007),
Knill et al. (2007). The method will be reviewed as described by Dobsicek
et al. (2007).

At the $k$th stage the experimenter lets $U_\theta^{2^{l-k}}$ act on one of two qubits.
The other qubit is acted on by a Z-rotation gate $e^{i\alpha_k\sigma_z}$ before being measured,
where $\alpha_1 = 0$ and $\alpha_k$ for $k = 2, \dots, l$ depend on the results from the
previous $k - 1$ stages. From this measurement, an estimate $\hat w_{l-k+1}$ is obtained
of the $(l - k + 1)$th binary digit. After $l$ stages an estimate $\hat\theta = 0.\hat w_1\hat w_2\cdots\hat w_l$
is obtained of $\theta$ which satisfies



$$\Pr\left((\hat\theta - 1/2^{l+1},\ \hat\theta + 1/2^{l+1}) \ni \theta\right) > 0.81. \tag{5.9}$$

The probability that the final interval contains $\theta$ can be increased to $1 - \epsilon$
by either (a) increasing the number of rounds to $l' = l + \log(2 + 1/(2\epsilon))$
or (b) using $O(\log_2(1/\epsilon))$ extra measurements of the first few binary digits
(Dobsicek et al. 2007). The method of Dobsicek et al. has recently been
carried out on experimental data by Liu et al. (2007). Similar work has also
been done by Higgins et al. (2007).



5.2 Problems 



There is nothing wrong with Kitaev's method of iterative estimation. However,
he does not give an algorithm for

(i) choosing which of the intervals contains $(2^{k-1}\theta)_{\mathrm{mod}\,1}$ with probability
$1 - \epsilon/l$,

(ii) reconstructing $\theta$ given confidence intervals for $(2^{k-1}\theta)_{\mathrm{mod}\,1}$.

As will be seen in this section, there are gaps in the methods of Rudolph
and Grover and Ji et al. for (i). There are problems with the method
of Rudolph and Grover, which will now be explained. Firstly, $p_k(0;\theta) = (1 + \cos(2^k\pi\theta))/2$ is a multimodal function of $\theta$. For example, $\theta = 3/4$ and
$\theta = 1/4$ give the same value of $p_1(0;\theta)$, even though they differ in the first
binary digit. To overcome this, an estimate of $\sin(2\pi\theta)$ is needed as well.
This however is a trivial point and is easily overcome.
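
The ambiguity, and the way a sine estimate removes it, can be seen in a few lines of Python. This is an illustrative sketch; the function names are assumptions.

```python
import math


def prob_x(theta):
    """Probability of outcome 0 when measuring in x: (1 + cos(2 pi theta)) / 2."""
    return (1 + math.cos(2 * math.pi * theta)) / 2


def prob_y(theta):
    """Probability of outcome 0 when measuring in y: (1 + sin(2 pi theta)) / 2."""
    return (1 + math.sin(2 * math.pi * theta)) / 2


# theta = 1/4 and theta = 3/4 cannot be told apart from the x statistics alone...
same_in_x = abs(prob_x(0.25) - prob_x(0.75)) < 1e-12
# ...but the y statistics separate them completely (probabilities 1 and 0).
gap_in_y = prob_y(0.25) - prob_y(0.75)
```

More generally `prob_x(theta)` equals `prob_x(1 - theta)`, while `prob_y` takes different values at the two points whenever `theta` is not 0 or 1/2.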

Secondly, if $\theta = 1/2 \pm \delta$, where $\delta$ is small, a large number of measurements
is required to determine the first bit of $\theta$ correctly with high probability. If
a mistake is made then, for the final estimate $\hat\theta$, $|\hat\theta - \theta|_1 > \delta$. This problem,
which occurs for $\theta$ close to 1/2, was pointed out by Ji et al. (2008).

A similar problem also occurs for $\theta = 0 \pm \delta$. Because of this, difficulties
will be encountered in estimating the $k$th bit of $\theta$ whenever $(2^{k-1}\theta)_{\mathrm{mod}\,1} \approx 0$,
$(2^{k-1}\theta)_{\mathrm{mod}\,1} \approx 1$ or $(2^{k-1}\theta)_{\mathrm{mod}\,1} \approx 1/2$. However, it may also be possible to
overcome this problem using extra rotation gates in these cases.

There are also gaps in the method of Ji et al. (2008). Firstly, like Rudolph
and Grover, they overlook the fact that $p_1(0;\theta) = (1 + \cos(2\pi\theta))/2$ is bimodal.
Secondly, the accuracy of their final estimate relies on the assumption that
if $|\hat\theta - \theta|_1 < 1/12$ and $\hat\theta \in [0, 5/12)$, then $\theta \in [0, 1/2)$. This is not true:
consider $\theta = -1/12 \notin [0, 1/2)$. Similarly, they assume that if $|\hat\theta - \theta|_1 < 1/12$
and $\hat\theta \in [7/12, 1)$, then $\theta \in [1/2, 1)$, which again is not true: consider
$\theta = 1/12 \notin [1/2, 1)$. Again problems will be encountered at the $k$th stage if
$(r_1\cdots r_{k-1}\theta)_{\mathrm{mod}\,1} \approx 0$ or $(r_1\cdots r_{k-1}\theta)_{\mathrm{mod}\,1} \approx 1$.



5.3 An iterative estimation algorithm 

This section contains a new method of phase estimation. Firstly, an iterative
algorithm is given for going from confidence arcs for $\theta$, $(2\theta)_{\mathrm{mod}\,1}$, $(4\theta)_{\mathrm{mod}\,1}$, $\dots$,
$(2^{l-1}\theta)_{\mathrm{mod}\,1}$, of length 1/3 and coverage probability at least $1 - \epsilon/l$, to a
confidence arc for $\theta$ of length $1/(2^{l-1} \times 3)$ and coverage probability at least $1 - \epsilon$.
Secondly, a method is given for obtaining a confidence arc for $(2^{k-1}\theta)_{\mathrm{mod}\,1}$,
of length 1/3 and coverage probability at least $1 - \epsilon/l$. Thirdly, one of
Bernstein's inequalities is used to calculate the number of measurements
needed at each stage. Finally, it is shown that it is possible to choose a value
of $\epsilon$ such that $1 - \langle F(U_{\hat\theta}, U_\theta)\rangle = O((\log n/n)^2)$.



5.3.1 The iterative algorithm 

First an intuitive approach is given using examples. For computational simplicity,
confidence arcs of length 0.3 and coverage probability 1 will be considered.
$L_k$ and $J_k$ will denote confidence arcs for $(2^{k-1}\theta)_{\mathrm{mod}\,1}$ and $2^{k-1}\theta$
respectively, of length 0.3 and coverage probability 1. (In the more general
algorithm $L_k$ and $J_k$ will have length 1/3 and coverage probability at least
$1 - \epsilon/l$.) For the examples, $l = 3$.



Example 1

Suppose that after doing some measurements of $U_\theta$, $U_\theta^2$ and $U_\theta^4$ it is found
that

$$L_1 = [0.6, 0.9] \ni \theta, \tag{5.10}$$
$$L_2 = [0.3, 0.6] \ni (2\theta)_{\mathrm{mod}\,1}, \tag{5.11}$$
$$L_3 = [0.8, 1.1] \ni (4\theta)_{\mathrm{mod}\,1}. \tag{5.12}$$

It follows from (5.10) that

$$2L_1 = [1.2, 1.8] \ni 2\theta. \tag{5.13}$$

Using (5.11) and (5.13), it follows that

$$J_2 = [1.3, 1.6] \ni 2\theta. \tag{5.14}$$

From (5.14) it is known that

$$2J_2 = [2.6, 3.2] \ni 4\theta. \tag{5.15}$$

Using (5.12) and (5.15) gives

$$J_3 = [2.8, 3.1] \ni 4\theta. \tag{5.16}$$

Using confidence arcs (5.10), (5.11) and (5.12) for $\theta$, $(2\theta)_{\mathrm{mod}\,1}$ and $(4\theta)_{\mathrm{mod}\,1}$
respectively, of length 0.3 and coverage probability 1, a confidence arc (5.16)
has been derived for $4\theta$ of length 0.3 and coverage probability 1. This gives
a confidence arc for $\theta$ of length $0.3/2^{3-1} = 0.075$ and coverage probability 1,
namely

$$(1/4)J_3 = [0.7, 0.775] \ni \theta. \tag{5.17}$$

Remember that confidence arcs on a circle are being considered. On the
circle the arc [1.2, 1.8] is equivalent to the arc [0.2, 0.8], as are [2.2, 2.8], [3.2, 3.8],
$\dots$. Similarly, [2.6, 3.2] is equivalent to [0.6, 1.2].

The symbol $\subset_1$ will be used to signify that a confidence arc on the circle
is a subset of another confidence arc on the circle. Similarly, the symbol
$\in_1$ will be used to signify that a point is contained within an arc on the
circle, e.g. $0.3 \in_1 [1.2, 1.8]$. The previous example was rather simple in that
$[0.3, 0.6] \subset_1 [1.2, 1.8]$ and $[0.8, 1.1] \subset_1 [2.6, 3.2]$.

Consider the following example, for which $L_{k+1} \not\subset_1 2J_k$. (Note that
$L_1 = J_1$.)

Example 2

Suppose that after doing some measurements of $U_\theta$, $U_\theta^2$ and $U_\theta^4$ it is found
that

$$L_1 = [0.1, 0.4] \ni \theta, \tag{5.18}$$
$$L_2 = [0.7, 1.0] \ni (2\theta)_{\mathrm{mod}\,1}, \tag{5.19}$$
$$L_3 = [0.9, 1.2] \ni (4\theta)_{\mathrm{mod}\,1}. \tag{5.20}$$

It follows from (5.18) that

$$2J_1 = [0.2, 0.8] \ni 2\theta. \tag{5.21}$$

Now $L_2 \not\subset_1 2J_1$. From (5.19) and (5.21) it follows that

$$[0.7, 0.8] \ni 2\theta. \tag{5.22}$$

However, for simplicity, the confidence arcs $J_k$ will be kept of equal length
(in this example 0.3, in the more general algorithm 1/3). There is no unique
way to do this. A convenient way is to keep $J_k \subset 2J_{k-1}$ and $J_k$ of length 0.3.
Thus for this example the upper bound for $2\theta$ remains as 0.8 and the lower
bound is chosen to be $0.8 - 0.3 = 0.5$. This gives

$$J_2 = [0.5, 0.8] \ni 2\theta. \tag{5.23}$$

From (5.23) it follows that

$$2J_2 = [1.0, 1.6] \ni 4\theta. \tag{5.24}$$

Now, again $L_3 \not\subset_1 2J_2$. To keep $J_k \subset_1 2J_{k-1}$ and $J_k$ of length 0.3, the lower
bound remains as 1.0 and the upper bound becomes $1.0 + 0.3 = 1.3$,

$$J_3 = [1.0, 1.3] \ni 4\theta. \tag{5.25}$$

A confidence arc for $2^{3-1}\theta$ has been found of length 0.3 and coverage probability 1.
This gives a confidence arc for $\theta$ of length $0.3/2^{3-1} = 0.075$ and
coverage probability 1, namely

$$(1/4)J_3 = [0.25, 0.325] \ni \theta. \tag{5.26}$$

General Algorithm 

The general algorithm will now be presented. Confidence arcs are now of
length 1/3 rather than 0.3, and coverage probability 1,

$$L_k = [x(k), x(k) + 1/3], \quad x(k) \in [0, 1), \tag{5.27}$$
$$J_k = [z(k), z(k) + 1/3]. \tag{5.28}$$

As in the examples, $2J_k$ and $L_{k+1}$ are used to find a confidence arc $J_{k+1}$, with
$J_{k+1} \subset 2J_k$. For $J_{k+1} \subset 2J_k$ it is required that $z(k+1) \in [2z(k), 2z(k) + 1/3]$.
Assuming that $J_k \ni 2^{k-1}\theta$ and $L_{k+1} \ni (2^k\theta)_{\mathrm{mod}\,1}$, there are three possibilities.
For each possibility a figure is given (showing, for simplicity, a line instead
of an arc), with a small vertical line representing the choice of the lower
boundary $z(k+1)$ of $J_{k+1}$. Note that $J_1 = L_1$.



Figure 5.1: Situation (i)

Figure 5.2: Situation (ii)

Figure 5.3: Situation (iii)

(i) The simplest possibility is that $L_{k+1} \subset_1 2J_k$. This occurs when

$$(x(k+1) - 2z(k))_{\mathrm{mod}\,1} \in [0, 1/3).$$

In this case the lower boundary of $J_{k+1}$ is taken to be

$$z(k+1) = 2z(k) + (x(k+1) - 2z(k))_{\mathrm{mod}\,1}.$$

(ii) Another possibility is that $x(k+1) \notin_1 2J_k$ but $x(k+1) + 1/3 \in_1 2J_k$.
This occurs when

$$(x(k+1) - 2z(k))_{\mathrm{mod}\,1} \in [2/3, 1).$$

In this case the lower boundary of $J_{k+1}$ is taken to be

$$z(k+1) = 2z(k).$$

(iii) The final possibility is that $x(k+1) \in_1 2J_k$ but $x(k+1) + 1/3 \notin_1 2J_k$.
This occurs when

$$(x(k+1) - 2z(k))_{\mathrm{mod}\,1} \in [1/3, 2/3).$$

In this case the lower boundary of $J_{k+1}$ is taken to be

$$z(k+1) = 2z(k) + \frac{1}{3}.$$

This iterative scheme gives the confidence arc $J_l = [z(l), z(l) + 1/3]$ for
$2^{l-1}\theta$ of length 1/3 and coverage probability 1. This gives a confidence arc
for $\theta$ of length $1/(2^{l-1} \times 3)$, and coverage probability 1, namely $(1/2^{l-1})J_l = [z(l)/2^{l-1}, (z(l) + 1/3)/2^{l-1}]$. The centre of this interval, modulo 1, is taken
as the final estimate $\hat\theta$ of $\theta$, i.e.

$$\hat\theta = \left(\frac{z(l) + 1/6}{2^{l-1}}\right)_{\mathrm{mod}\,1}.$$

The final confidence arc for $\theta$ of length $1/(2^{l-1} \times 3)$ contains $\theta$ if $L_k \ni (2^{k-1}\theta)_{\mathrm{mod}\,1}$ for every $k = 1, \dots, l$. If, for every $k = 1, \dots, l$, $L_k$ has coverage
probability at least $1 - \epsilon/l$, the coverage probability of the final confidence
arc is at least $1 - \epsilon$.
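
The three situations translate directly into a short routine. The following Python sketch is one possible rendering of the general algorithm, not a definitive implementation: the function names are assumptions, and exact rational arithmetic (`fractions.Fraction`) is used so that the interval boundaries 1/3 and 2/3 are compared without rounding error.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)


def next_lower_bound(z_k, x_next):
    """Lower boundary z(k+1) of J_{k+1}, given z(k) and the lower boundary
    x(k+1) of L_{k+1}."""
    diff = (x_next - 2 * z_k) % 1        # (x(k+1) - 2 z(k)) mod 1, in [0, 1)
    if diff < THIRD:                     # situation (i): L_{k+1} lies inside 2 J_k
        return 2 * z_k + diff
    if diff >= 2 * THIRD:                # situation (ii): only the top of L_{k+1} is in 2 J_k
        return 2 * z_k
    return 2 * z_k + THIRD               # situation (iii): only the bottom of L_{k+1} is in 2 J_k


def reconstruct(xs):
    """Combine arcs L_k = [xs[k-1], xs[k-1] + 1/3], k = 1..l.
    Returns z(l) and the final estimate of theta (centre of J_l / 2^(l-1), mod 1)."""
    z = xs[0]                            # J_1 = L_1
    for x_next in xs[1:]:
        z = next_lower_bound(z, x_next)
    stages = len(xs)
    estimate = ((z + Fraction(1, 6)) / 2 ** (stages - 1)) % 1
    return z, estimate
```

When every arc passed in does contain the corresponding $(2^{k-1}\theta)_{\mathrm{mod}\,1}$, the returned estimate lies within $1/(2^l \times 3)$ of $\theta$ on the circle, which is half the length of the final confidence arc.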

5.3.2 Finding $L_k$

The following function will be used:

$$\mathrm{atan2}(x, y) = \begin{cases}
\arctan(y/x) & x > 0, \\
\arctan(y/x) + \pi & x < 0,\ y \ge 0, \\
\arctan(y/x) - \pi & x < 0,\ y < 0, \\
\pi/2 & x = 0,\ y > 0, \\
-\pi/2 & x = 0,\ y < 0, \\
\text{undefined} & x = 0,\ y = 0.
\end{cases} \tag{5.29}$$

Here, details are given for calculating confidence arcs $L_k$ for $(2^{k-1}\theta)_{\mathrm{mod}\,1}$ of
length 1/3 and coverage probability at least $1 - \epsilon/l$. First it will be shown how
to compute a confidence arc of length 1/3, then how to make the coverage
probability at least $1 - \epsilon/l$.

The problem of finding a confidence arc for $\theta$ will be considered. The
analysis is exactly the same as for $(2^{k-1}\theta)_{\mathrm{mod}\,1}$, except that in the latter case
the experimenter lets $U_\theta$ act $2^{k-1}$ times on the same $|\psi_x\rangle$.

The experimenter lets $U_\theta$ act on $|\psi_x\rangle$ and then measures in $x$. Outcome 0
is observed with probability $p_x(0;\theta) = (1 + \cos(2\pi\theta))/2$. The state $U_\theta|\psi_x\rangle$
is measured in $x$ a total of $N$ times and outcome 0 is observed $N_{x=0}$ times.
This gives an estimate $2N_{x=0}/N - 1$ of $\cos(2\pi\theta)$.

The experimenter lets $U_\theta$ act on $|\psi_x\rangle$ and measures in $y$. Outcome 0 is
observed with probability $p_y(0;\theta) = (1 + \sin(2\pi\theta))/2$. The state $U_\theta|\psi_x\rangle$ is
measured in $y$ a total of $N$ times and outcome 0 is observed $N_{y=0}$ times.
This gives an estimate $2N_{y=0}/N - 1$ of $\sin(2\pi\theta)$. Estimates of $\sin(2\pi\theta)$ and
$\cos(2\pi\theta)$ give the estimate

$$\hat\theta_1 = \left(\frac{1}{2\pi}\,\mathrm{atan2}\!\left(\frac{2N_{x=0}}{N} - 1,\ \frac{2N_{y=0}}{N} - 1\right)\right)_{\mathrm{mod}\,1}$$

of $\theta$. The confidence arc is

$$L_1 = \left((\hat\theta_1 - 1/6)_{\mathrm{mod}\,1},\ (\hat\theta_1 - 1/6)_{\mathrm{mod}\,1} + 1/3\right). \tag{5.30}$$

More generally, an estimate $(2^{k-1}\hat\theta_k)_{\mathrm{mod}\,1}$ of $(2^{k-1}\theta)_{\mathrm{mod}\,1}$ gives the confidence
arc

$$L_k = \left(\left((2^{k-1}\hat\theta_k)_{\mathrm{mod}\,1} - 1/6\right)_{\mathrm{mod}\,1},\ \left((2^{k-1}\hat\theta_k)_{\mathrm{mod}\,1} - 1/6\right)_{\mathrm{mod}\,1} + 1/3\right). \tag{5.31}$$
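
A minimal sketch of how a confidence arc could be computed from measurement counts is given below. The function name and arguments are assumptions made for illustration. Note that Python's `math.atan2` takes its arguments in the order (sine-like, cosine-like), which is the reverse of the argument order used in the definition of atan2 above.

```python
import math


def confidence_arc(n_x0, n_y0, n_meas):
    """Return (theta_hat, (lower, upper)) from the number of 0 outcomes in
    n_meas measurements in x and n_meas measurements in y."""
    cos_hat = 2 * n_x0 / n_meas - 1      # estimate of cos(2 pi theta)
    sin_hat = 2 * n_y0 / n_meas - 1      # estimate of sin(2 pi theta)
    theta_hat = (math.atan2(sin_hat, cos_hat) / (2 * math.pi)) % 1.0
    lower = (theta_hat - 1 / 6) % 1.0    # lower end of the arc, reduced mod 1
    return theta_hat, (lower, lower + 1 / 3)


# Counts close to their expected values for theta = 0.2 and N = 1000.
theta_hat, (lower, upper) = confidence_arc(655, 976, 1000)
```

The upper end of the arc is deliberately not reduced modulo 1, so an arc that wraps around the circle is returned with an upper end greater than 1, as in the examples of Section 5.3.1.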

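It is necessary to find the accuracy needed for the estimates of $p_x(0;\theta)$ and
$p_y(0;\theta)$ to ensure that $|\hat\theta - \theta|_1 < 1/6$, and hence $L_1 \ni \theta$.

Put $x = \cos(2\pi\theta)$, $y = \sin(2\pi\theta)$, $x_0 = 2N_{x=0}/N - 1$, $y_0 = 2N_{y=0}/N - 1$
and $\phi(x, y) = \mathrm{atan2}(x, y)$. Define

$$|\hat\phi - \phi|_{2\pi} = \min\left((\hat\phi - \phi)_{\mathrm{mod}\,2\pi},\ (\phi - \hat\phi)_{\mathrm{mod}\,2\pi}\right). \tag{5.32}$$

Given that

$$|x - x_0| \le a, \tag{5.33}$$
$$|y - y_0| \le a, \tag{5.34}$$

an upper bound is sought on $|\phi(x, y) - \phi(x_0, y_0)|_{2\pi}$. This will be done in
steps.

(i)

$$\begin{aligned}
|\phi(x, y) - \phi(x_0, y_0)|_{2\pi} &= \left|[\phi(x, y) - \phi(x, y_0)] + [\phi(x, y_0) - \phi(x_0, y_0)]\right|_{2\pi} \\
&\le |\phi(x, y) - \phi(x, y_0)|_{2\pi} + |\phi(x, y_0) - \phi(x_0, y_0)|_{2\pi}.
\end{aligned} \tag{5.35}$$

Put

$$\psi_1 = |\phi(x, y) - \phi(x, y_0)|_{2\pi},$$
$$\psi_2 = |\phi(x, y_0) - \phi(x_0, y_0)|_{2\pi}.$$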

(ii) Consider the triangle $T_1$ given by the points $(0, 0)$, $(x, y)$ and $(x, y_0)$, with
$y_0$ satisfying (5.34). The angle at the point $(0, 0)$ is $\psi_1$, and is opposite a side
of length $|y - y_0|$. The angle, say $\psi_A$, at the point $(x, y_0)$ will be opposite a
side of length 1. Using the sine rule for $T_1$ gives

$$\frac{\sin\psi_1}{|y - y_0|} = \frac{\sin\psi_A}{1}. \tag{5.36}$$

For any triangle the angles add up to $\pi$. The largest angle will be opposite
the longest side. For any angle, $\psi_*$ say, not opposite the longest side, $\psi_* \in [0, \pi/2]$. If $a < 1/2$ then from (5.34) $|y - y_0| < 1$ and so $\psi_1 \in [0, \pi/2]$.
Thus $\psi_1 \leftrightarrow \sin\psi_1$ (the correspondence is one-to-one on $[0, \pi/2]$), and hence $\psi_1 = \arcsin\beta$, with $\beta = |y - y_0| \times \sin\psi_A$. As $\psi_A \in [0, \pi]$, consequently $\sin\psi_A \in [0, 1]$, and using (5.34) it follows that
$\beta \in [0, a]$. Since arcsin is a monotone function on $[0, a]$, it follows that

$$\psi_1 \le \arcsin(a). \tag{5.37}$$



(iii) Consider the triangle $T_2$ given by the points $(0, 0)$, $(x, y_0)$ and $(x_0, y_0)$
with $x_0$ and $y_0$ satisfying (5.33) and (5.34) respectively. The angle at the
point $(0, 0)$ is $\psi_2$ and is opposite a side of length $|x - x_0|$. The angle, say $\psi_B$,
at the point $(x_0, y_0)$ is opposite a side of length $r$, where

$$\begin{aligned}
r = \sqrt{x^2 + y_0^2} &\ge \min_{\Delta \in [-a, a]} \sqrt{x^2 + (y + \Delta)^2} \\
&= \min_\Delta \sqrt{x^2 + y^2 + 2y\Delta + \Delta^2} \\
&= \min_\Delta \sqrt{1 + 2y\Delta + \Delta^2} \\
&\ge \min_\Delta \sqrt{1 - 2|\Delta| + |\Delta|^2} \\
&= \min_\Delta\, (1 - |\Delta|) \\
&= 1 - a.
\end{aligned} \tag{5.38}$$

Using the sine rule for $T_2$ gives

$$\frac{\sin\psi_2}{|x - x_0|} = \frac{\sin\psi_B}{r}. \tag{5.39}$$

If $a < 1/2$ then $a < 1 - a$ and so from (5.33) and (5.38), $|x - x_0| < r$. It
follows that $\psi_2 \in [0, \pi/2]$ and so $\psi_2 \leftrightarrow \sin\psi_2$. Using (5.33), (5.38), (5.39)
and monotonicity of arcsin on $[0, 1]$ gives

$$\psi_2 \le \arcsin\left(\frac{a}{1 - a}\right). \tag{5.40}$$



Theorem 5.1 Given (5.33) and (5.34) for $a < 1/2$,

\[ |\phi(x,y) - \phi(x_0,y_0)|_{2\pi} < \arcsin(a) + \arcsin\left(\frac{a}{1-a}\right). \tag{5.41} \]

Proof. This follows from (5.35), (5.37) and (5.40).

For the iterative algorithm it is required that $|\theta - \hat\theta_1|_1 < 1/6$, which is equivalent to

\[ |\phi(x,y) - \phi(x_0,y_0)|_{2\pi} < \frac{\pi}{3}. \tag{5.42} \]



If $a = 0.3794$ then (5.42) holds, and (5.33) and (5.34) are equivalent to

\[ |N_{x=0}/N - p_x(0;\theta)| < 0.1897, \tag{5.43} \]

\[ |N_{y=0}/N - p_y(0;\theta)| < 0.1897. \tag{5.44} \]

It follows that if

\[ \Pr\bigl(|N_{x=0}/N - p_x(0;\theta)| < 0.1897\bigr) \ge \sqrt{1 - \epsilon/l} \tag{5.45} \]

and

\[ \Pr\bigl(|N_{y=0}/N - p_y(0;\theta)| < 0.1897\bigr) \ge \sqrt{1 - \epsilon/l} \tag{5.46} \]

then

\[ \Pr(L_1 \ni \theta) \ge 1 - \epsilon/l. \tag{5.47} \]

An analogous result holds for $k = 2, \dots, l$. In Section 5.3.3 it is shown that if $N = 24.437 \log(4l/\epsilon)$ then (5.45) and (5.46) hold.
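As a numerical sanity check of the choice $a = 0.3794$, the following Python sketch evaluates the right-hand side of (5.41) and compares it with $\pi/3$ (one sixth of a full turn). It then samples perturbed points at random to confirm that the observed angular error stays below the bound. The function names and the sampling scheme are illustrative assumptions and are not taken from the thesis.

```python
import math
import random

def angle_bound(a):
    """Right-hand side of (5.41): arcsin(a) + arcsin(a / (1 - a)), for a < 1/2."""
    return math.asin(a) + math.asin(a / (1.0 - a))

def circular_distance(phi1, phi2):
    """Distance between two angles modulo 2*pi, as in (5.32)."""
    d = (phi1 - phi2) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def worst_sampled_error(a, trials=20000, seed=0):
    """Largest angular error seen for random points (x, y) on the unit circle
    and random perturbations with |x - x0| <= a and |y - y0| <= a."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        theta = rng.random()
        x = math.cos(2.0 * math.pi * theta)
        y = math.sin(2.0 * math.pi * theta)
        x0 = x + rng.uniform(-a, a)
        y0 = y + rng.uniform(-a, a)
        err = circular_distance(math.atan2(y, x), math.atan2(y0, x0))
        worst = max(worst, err)
    return worst

a = 0.3794
print(angle_bound(a), math.pi / 3)   # the bound sits just below pi/3
print(worst_sampled_error(a))        # sampled errors stay below the bound
```

The sampled worst case is well below the bound, which reflects that Theorem 5.1 is a sufficient condition obtained from two triangle estimates rather than a tight one.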



5.3.3 Number of measurements needed 



The following Bernstein inequality (Hazewinkel, 2002) will be used:

Theorem 5.2 If the equations

\[ E[Y_j] = 0, \quad E[Y_j^2] = b_j, \quad j = 1, \dots, n, \]

hold for the independent random variables $Y_1, \dots, Y_n$ with

\[ E[|Y_j|^l] \le \frac{b_j}{2} H^{l-2}\, l! \tag{5.48} \]

(where $l > 2$ and $H$ is a constant independent of $j$), then the following inequality holds for the sum $S_n = \sum_{j=1}^{n} Y_j$:

\[ \Pr(|S_n| > r) \le 2\exp\left( -\frac{r^2}{2(B_n + Hr)} \right), \tag{5.49} \]



where $B_n = \sum_{j=1}^{n} b_j$.

The observed measurement outcomes from a single measurement in $x$ have distribution and moments

\[ X_j \sim \mathrm{Bin}(1,p), \quad E[X_j] = p, \quad \mathrm{Var}[X_j] = p(1-p), \]

where $p = (1 + \cos(2\pi\theta))/2$. Put $b_j = p(1-p)$ for $j = 1, \dots, N$ and consider the random variable $R_j = X_j - p$, which has moments

\[ E[R_j] = 0, \quad E[R_j^2] = p(1-p) = b_j. \]

Now, for $l > 2$,

\[ \begin{aligned} E[|R_j|^l] &= p\,|1-p|^l + (1-p)\,|0-p|^l \\ &\le p(1-p)^2 + (1-p)p^2 \\ &= p(1-p). \end{aligned} \tag{5.50} \]

Thus, comparing (5.50) with (5.48), $H = 1$ is a suitable choice. Substituting $B_N = \sum_{j=1}^{N} b_j = Np(1-p)$ and $S_N = \sum_{j=1}^{N} R_j = N_{x=0} - Np$ into (5.49) gives

\[ \Pr(|N_{x=0} - Np| > r) \le 2\exp\left( -\frac{r^2}{2(Np(1-p) + r)} \right). \]

Putting $r = N\delta$ gives

\[ \begin{aligned} \Pr(|N_{x=0}/N - p| > \delta) &\le 2\exp\left( -\frac{N\delta^2}{2(p(1-p) + \delta)} \right) \\ &\le 2\exp\left( -\frac{N\delta^2}{2(1/4 + \delta)} \right). \end{aligned} \tag{5.51} \]



The inequality $\Pr(|N_{x=0}/N - p| < \delta) \ge \sqrt{1 - \epsilon/l}$ is equivalent to the inequality $\Pr(|N_{x=0}/N - p| \ge \delta) \le 1 - \sqrt{1 - \epsilon/l}$, which holds if $\Pr(|N_{x=0}/N - p| \ge \delta) \le \epsilon/(2l)$, since $1 - \sqrt{1-u} \ge u/2$ for $u \in [0,1]$. Substituting $\delta = 0.1897$ into (5.51), it can be found that (5.45) holds if

\[ N = 24.437 \ln(4l/\epsilon) \tag{5.52} \]

measurements in $x$ are performed at each stage. The analysis is exactly the same for measurements in $y$, and so a total number of

\[ N_{\mathrm{tot}} = 48.874 \ln(4l/\epsilon) \tag{5.53} \]

measurements are required at each stage. This ensures that (5.45) and (5.46) hold, and consequently (5.47) holds.
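The constant 24.437 can be recovered by requiring the right-hand side of (5.51) to be at most $\epsilon/(2l)$, which gives $N \ge 2(1/4 + \delta)\,\delta^{-2} \ln(4l/\epsilon)$. The Python sketch below evaluates this constant for $\delta = 0.1897$ and checks the tail bound for one pair of values. The function names and the example values of $l$ and $\epsilon$ are illustrative assumptions.

```python
import math

def bernstein_constant(delta):
    """Constant c such that N = c * ln(4 l / eps) makes the bound (5.51)
    at most eps / (2 l): c = 2 (1/4 + delta) / delta**2."""
    return 2.0 * (0.25 + delta) / delta ** 2

def measurements_per_basis(l, eps, delta=0.1897):
    """Number of measurements in one basis per stage, as in (5.52), rounded up."""
    return math.ceil(bernstein_constant(delta) * math.log(4.0 * l / eps))

def tail_bound(N, delta):
    """Right-hand side of (5.51), with p (1 - p) replaced by its maximum 1/4."""
    return 2.0 * math.exp(-N * delta ** 2 / (2.0 * (0.25 + delta)))

c = bernstein_constant(0.1897)
print(round(c, 3))                      # constant appearing in (5.52)

l, eps = 6, 0.01                        # illustrative values, not from the thesis
N = measurements_per_basis(l, eps)
print(N, 2 * N)                         # per basis, and per stage as in (5.53)
print(tail_bound(N, 0.1897) <= eps / (2 * l))
```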
5.3.4 The behaviour of the fidelity



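The behaviour of $1 - \langle F(U_{\hat\theta}, U_\theta)\rangle$ will be analysed as a function of the number $n$ of times $U_\theta$ is used. As in Rudolph and Grover (2003), the worst-case value of $1 - \langle F(U_{\hat\theta}, U_\theta)\rangle$ will be sought. That is, if the final confidence arc does not contain $\theta$ then $\hat\theta = (\theta + 1/2)_{\mathrm{mod}\,1}$, and if it does then $\hat\theta$ lies on the boundary of the confidence arc, i.e. $|\hat\theta - \theta|_1 = 1/(2^l \times 3)$. This gives

\[ 1 - \langle F(U_{\hat\theta}, U_\theta)\rangle \le 1 - \left( (1-\epsilon)\,\frac{1 + \cos(2\pi/(2^l \times 3))}{2} + \epsilon \times 0 \right) \le \frac{\pi^2}{2^{2l} \times 9} + \epsilon - \frac{\epsilon\pi^2}{2^{2l} \times 9}. \]

If $\epsilon = 1/2^{2l}$, then $1 - \langle F(U_{\hat\theta}, U_\theta)\rangle = O(1/2^{2l})$. This requires a total of

\[ N_{\mathrm{tot}} = 48.874 \log(4l \times 2^{2l}) \tag{5.54} \]

measurements at each stage. The number of times $U_\theta$ is used is $n = N_{\mathrm{tot}}(2^l - 1)$, and so $1/2^l \approx N_{\mathrm{tot}}/n$. The number of measurements, (5.54), made at each stage is $O(l)$; noticing that $\log n$ is also $O(l)$, it follows that

\[ 1 - \langle F(U_{\hat\theta}, U_\theta)\rangle = O\left( \left( \frac{\log n}{n} \right)^2 \right). \tag{5.55} \]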
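The scaling argument can be illustrated numerically. The Python sketch below computes the measurements per stage from (5.54), the total number of channel uses $n$, and the ratio of $2^{-2l}$ to $(\log n/n)^2$, which should remain bounded as $l$ grows if the rate (5.55) holds. Rounding $N_{\mathrm{tot}}$ up to an integer and the chosen values of $l$ are assumptions made for illustration; constants hidden in the $O(\cdot)$ notation are ignored.

```python
import math

def n_tot(l):
    """Measurements per stage from (5.54), rounded up to an integer."""
    return math.ceil(48.874 * math.log(4 * l * 2 ** (2 * l)))

def channel_uses(l):
    """Total uses of the channel: stage k applies it 2**(k-1) times per
    measurement, with n_tot(l) measurements at each of the l stages."""
    return sum(n_tot(l) * 2 ** (k - 1) for k in range(1, l + 1))

for l in (5, 10, 20, 40):
    n = channel_uses(l)
    infidelity_scale = 2.0 ** (-2 * l)          # order of 1 - <F> when eps = 2**(-2l)
    reference = (math.log(n) / n) ** 2          # the rate in (5.55)
    print(l, n, infidelity_scale / reference)   # ratio stays bounded as l grows
```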



5.4 Simulations 

The analysis in Section 5.3.4 concentrated on optimizing the worst-case asymptotic scaling of $1 - \langle F\rangle$ with respect to $n$. Here a more pragmatic line will be taken. Of interest is the minimum number of measurements needed such that the final confidence arc contains $\theta$ a satisfactory proportion of the time.

The iterative algorithm will now be investigated using simulations with the computer package MAPLE. A value for the parameter $\theta \in [0,1)$ is given by a random variable with a uniform distribution. Measurement results can be simulated, since the number of times outcome 0 is observed has a binomial distribution. For example, at the $k$th iterative stage, measuring in $x$, $N_{x=0} \sim \mathrm{Bin}(N, (1 + \cos(2^k\pi\theta))/2)$. From the simulated results of measurements in $x$ and $y$ for stages $1, \dots, l$, an estimate of $\theta$ is obtained using the iterative algorithm given in Section 5.3.1. It can then be checked whether the final confidence arc contains $\theta$. This is done for 100,000 randomly chosen values of $\theta$, and the number of times the final interval contains $\theta$ is recorded.

For most recent iterative schemes the total number of iterations is reasonably small: 6 in Higgins et al. (2007) and 7 in Liu et al. (2007). Simulations were performed with the number of iterations varying between 6 and 9. Table 5.1 gives the number of times the final confidence arc contains the true value of $\theta$.





             Number of iterative stages (l)
N_tot        6          7          8          9
20           99,792     99,729     99,747     99,712
30           99,993     99,987     99,982     99,978
40           99,999     100,000    99,998     99,999
50           100,000    100,000    99,999     100,000

Table 5.1: Numbers of trials out of 100,000 with $|\hat\theta - \theta|_1 < 1/(2^l \times 3)$.



It seems a waste to use $N_{\mathrm{tot}} = 48.874 \log(4l \times 2^{2l})$ measurements at each stage, since the simulations suggest that for practical purposes it is sufficient to use fewer measurements, even as few as 20 or 30.
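The simulation procedure described in this section can be sketched in Python rather than MAPLE. The algorithm of Section 5.3.1 is not reproduced here, so the combination rule in `estimate_theta` is an assumption: at stage $k$ it keeps the candidate $(\hat\varphi_k + j)/2^{k-1}$ closest to the previous estimate, which gives $|\hat\theta - \theta|_1 < 1/(2^l \times 3)$ whenever every stage estimate is within $1/6$ of $(2^{k-1}\theta)_{\mathrm{mod}\,1}$. All function names, the seed and the reduced number of trials are likewise illustrative.

```python
import math
import random

def simulate_stage(theta, k, n_per_basis, rng):
    """Simulate stage k: the channel acts 2**(k-1) times, then n_per_basis
    measurements are made in x and in y. Returns an estimate in [0, 1) of
    (2**(k-1) * theta) mod 1."""
    angle = 2.0 ** k * math.pi * theta            # = 2*pi * 2**(k-1) * theta
    p_x = (1.0 + math.cos(angle)) / 2.0           # probability of outcome 0 in x
    p_y = (1.0 + math.sin(angle)) / 2.0           # probability of outcome 0 in y
    n_x0 = sum(rng.random() < p_x for _ in range(n_per_basis))
    n_y0 = sum(rng.random() < p_y for _ in range(n_per_basis))
    x0 = 2.0 * n_x0 / n_per_basis - 1.0
    y0 = 2.0 * n_y0 / n_per_basis - 1.0
    return (math.atan2(y0, x0) / (2.0 * math.pi)) % 1.0

def estimate_theta(theta, stages, n_per_basis, rng):
    """Assumed combination rule: at stage k keep the candidate
    (phi_k + j) / 2**(k-1) that is closest to the previous estimate."""
    estimate = simulate_stage(theta, 1, n_per_basis, rng)
    for k in range(2, stages + 1):
        scale = 2 ** (k - 1)
        phi_k = simulate_stage(theta, k, n_per_basis, rng)
        j = round(scale * estimate - phi_k)
        estimate = ((phi_k + j) / scale) % 1.0
    return estimate

def circular_distance(a, b):
    """Distance between two points of [0, 1) modulo 1."""
    d = (a - b) % 1.0
    return min(d, 1.0 - d)

def coverage(stages, n_per_basis, trials, seed=0):
    """Fraction of trials with |estimate - theta|_1 < 1 / (2**stages * 3)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        theta = rng.random()
        estimate = estimate_theta(theta, stages, n_per_basis, rng)
        if circular_distance(estimate, theta) < 1.0 / (2 ** stages * 3):
            hits += 1
    return hits / trials

# 25 measurements per basis corresponds to N_tot = 50 measurements per stage.
print(coverage(stages=6, n_per_basis=25, trials=2000))
```

Because the combination rule and the random number generator differ from those used for Table 5.1, the counts produced by this sketch should only be expected to agree with the table qualitatively.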



5.4.1 Estimating the coverage probability 

Using the above simulations the coverage probability can be estimated, i.e. the probability that, using the iterative algorithm, the known true value $\theta$ is contained in the final confidence interval.

Suppose the true (unknown) coverage probability is $p$. For the $i$th trial put

\[ W_i = \begin{cases} 1 & \text{if the interval covers } \theta, \\ 0 & \text{if not.} \end{cases} \]

Then $W_1, \dots, W_M$ are independent identically distributed Bernoulli random variables, i.e. $W_i \sim \mathrm{Bin}(1,p)$. Thus

\[ W_1 + \dots + W_M \sim \mathrm{Bin}(M,p). \]

If $m$ out of $M$ intervals cover $\theta$ then $p$ is estimated by $m/M$. An approximate 95% confidence interval for $p$ is

\[ \frac{m}{M} \pm 1.96 \sqrt{\frac{1}{M}\,\frac{m}{M}\left(1 - \frac{m}{M}\right)}. \]

The longest confidence interval (0.00066) is that for using 9 iterative stages and a total of 20 measurements at each stage. Using the half-length of this confidence interval, the confidence interval

\[ \frac{m}{100{,}000} \pm 0.00033 \]

can be computed from the results given in Table 5.1. It has coverage probability at least 95%.
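The quoted interval length can be reproduced with a few lines of Python. The sketch below assumes the usual normal-approximation interval with multiplier 1.96 and applies it to the Table 5.1 entry for 9 iterative stages and 20 measurements per stage; the function name is illustrative.

```python
import math

def wald_interval(m, M, z=1.96):
    """Approximate 95% confidence interval for a binomial proportion,
    using the normal approximation m/M +/- z * sqrt(p_hat (1 - p_hat) / M)."""
    p_hat = m / M
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / M)
    return p_hat - half, p_hat + half

# Table 5.1 entry for 9 iterative stages and N_tot = 20.
low, high = wald_interval(99712, 100000)
print(round(high - low, 5))    # length of the interval
print(round((high - low) / 2, 5))  # half-length
```

Every other entry of Table 5.1 has $m/M$ closer to 1, so its interval is shorter, which is why the half-length of this one can be used as a common margin.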



5.5 The noisy case 

It is known that when even a small amount of noise is present the performance of phase estimation schemes is greatly reduced (Huelga et al., 1997; Shaji and Caves, 2007).

This section investigates the performance of the iterative estimation algorithm when depolarizing noise is present. The channel

\[ \rho_0 \mapsto (1 - r)\, U_\theta \rho_0 U_\theta^\dagger + \frac{r}{2} I_2, \qquad 0 \le r \le 1, \tag{5.56} \]

is considered, where $U_\theta$ is the same as before, (5.1), and $\rho_0 = |\psi_x\rangle\langle\psi_x|$. (The channel (5.56) is identical to $U_\theta \rho_0 U_\theta^\dagger$ undergoing phase damping with parameter $r(2 - r)$ (Nielsen and Chuang, 2000, p. 383).) Ji et al. (2008) obtained the very interesting result that if $r > 0$, then the optimal asymptotic rate at which $1 - \langle F(U_{\hat\theta}, U_\theta)\rangle$ approaches zero is given by the standard quantum limit.

Defining $n'$ as the maximum number of times the experimenter lets $U_\theta$ act on the same input state, Ji et al. (2008) argued that if $(1 - r)^{n'}$ is close to 1, and thus $n'r \ll 1$, then it is still possible to estimate $\theta$ as before with the rate $O((\log n/n)^2)$.

The whole point of using an iterative scheme is that the distinguishability of $\theta$ from $\cos(n 2\pi\theta)$, with $n \gg 1$, is considerably greater than from $\cos(2\pi\theta)$.

To measure distinguishability, the quantity $F_\theta^M/m$ will be used, where $m$ is the number of times $U_\theta$ acts on the same input state. This is because the aim is to maximize the distinguishability of $\theta$ per use of the channel.

If there is no noise, and the experimenter lets $U_\theta$ act $m$ times on the input state and measures in $x$, then outcome 0 is observed with probability $p(0;\theta) = (1 + \cos(m 2\pi\theta))/2$ and 1 with probability $p(1;\theta) = 1 - p(0;\theta)$. The Fisher information from this measurement is $F_\theta^{M_x} = 4\pi^2 m^2$, which is equal to the SLD quantum information. Measuring in $y$ gives the same Fisher information. Thus $F_\theta^{M_x}/m = F_\theta^{M_y}/m = 4\pi^2 m$. At the $k$th stage of the iterative procedure, the experimenter lets $U_\theta$ act $m = 2^{k-1}$ times on the input state, and so $F_\theta^{M_x}/m = F_\theta^{M_y}/m = \pi^2 2^{k+1}$. Thus $F_\theta^{M}/m$ (where $M$ is an arbitrary measurement in $x$ or $y$) increases exponentially with $k$.

In the noisy case, when the experimenter lets $U_\theta$ act $m$ times on the input state and then measures in $x$, outcome 0 is observed with probability $p(0;\theta) = (1 + (1 - r)^m \cos(m 2\pi\theta))/2$ and 1 with probability $p(1;\theta) = 1 - p(0;\theta)$. Measuring in $y$, outcome 0 is observed with probability $p(0;\theta) = (1 + (1 - r)^m \sin(m 2\pi\theta))/2$ and 1 with probability $p(1;\theta) = 1 - p(0;\theta)$. This gives

\[ F_\theta^{M_x} = \frac{4\pi^2 m^2 (1 - r)^{2m} \sin^2(2m\pi\theta)}{1 - (1 - r)^{2m} \cos^2(2m\pi\theta)}, \]

\[ F_\theta^{M_y} = \frac{4\pi^2 m^2 (1 - r)^{2m} \cos^2(2m\pi\theta)}{1 - (1 - r)^{2m} \sin^2(2m\pi\theta)}, \]

\[ H_\theta = 4\pi^2 m^2 (1 - r)^{2m}. \]

Notice that

\[ F_\theta^{M_x} + F_\theta^{M_y} \approx H_\theta. \]

Thus measuring both in $x$ and $y$, the average Fisher information from a single measurement $M$ is approximately $H_\theta/2$.
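The closed form for the Fisher information of the $x$ measurement can be checked against the definition $(\partial p/\partial\theta)^2 / (p(1-p))$ for a two-outcome measurement. The Python sketch below does this with a central finite difference. The function names, the step size and the example values of $\theta$, $m$ and $r$ are illustrative assumptions.

```python
import math

def p0_x(theta, m, r):
    """Probability of outcome 0 when measuring in x after m noisy uses."""
    return (1.0 + (1.0 - r) ** m * math.cos(2.0 * math.pi * m * theta)) / 2.0

def fisher_numeric(p0, theta, h=1e-6):
    """Fisher information of a two-outcome measurement, (dp/dtheta)^2 / (p (1 - p)),
    with the derivative taken by a central finite difference."""
    dp = (p0(theta + h) - p0(theta - h)) / (2.0 * h)
    p = p0(theta)
    return dp ** 2 / (p * (1.0 - p))

def fisher_x(theta, m, r):
    """Closed form for the x measurement quoted in the text."""
    c2 = (1.0 - r) ** (2 * m)
    phi = 2.0 * math.pi * m * theta
    return (4.0 * math.pi ** 2 * m ** 2 * c2 * math.sin(phi) ** 2
            / (1.0 - c2 * math.cos(phi) ** 2))

def sld_information(m, r):
    """H_theta = 4 pi^2 m^2 (1 - r)^(2m)."""
    return 4.0 * math.pi ** 2 * m ** 2 * (1.0 - r) ** (2 * m)

theta, m, r = 0.1, 8, 2.0 ** -4       # illustrative values
numeric = fisher_numeric(lambda t: p0_x(t, m, r), theta)
print(numeric, fisher_x(theta, m, r), sld_information(m, r))
```

For any choice of parameters the Fisher information of the $x$ measurement cannot exceed $H_\theta$, which the sketch also makes easy to confirm.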

The maximal value of $F_\theta^M/m$, taken over $m$, will occur close to the maximal value of $H_\theta/m$. When $r > 0$, $H_\theta/m$, and hence $F_\theta^M/m$, does not increase indefinitely with $m$. Instead it reaches its maximum at

\[ m = -\frac{1}{2\log(1 - r)}, \tag{5.57} \]

after which it decreases. When $r$ is small, this maximum is obtained at

\[ m \approx \frac{1}{2r}. \tag{5.58} \]

Thus in the noisy case the number of iterative stages that should be performed is limited by the amount of noise. The number of stages that can be performed, for small $r$, such that $H_\theta/m$, and hence $F_\theta^M/m$, increases at each stage is approximately $l \approx -\log_2 r$.
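The location of the maximum follows from differentiating $H_\theta/m = 4\pi^2 m (1-r)^{2m}$ with respect to $m$, which vanishes when $1 + 2m\log(1-r) = 0$. The Python sketch below compares (5.57) with the approximation (5.58) and finds, among the values $m = 2^{k-1}$ used by the iterative scheme, the stage $k$ with the largest $H_\theta/m$. The function names and the range of stages searched are illustrative assumptions.

```python
import math

def h_per_use(m, r):
    """H_theta / m = 4 pi^2 m (1 - r)^(2m)."""
    return 4.0 * math.pi ** 2 * m * (1.0 - r) ** (2 * m)

def optimal_m(r):
    """Stationary point (5.57): m = -1 / (2 log(1 - r))."""
    return -1.0 / (2.0 * math.log(1.0 - r))

def best_stage(r, max_stage=12):
    """Stage k in 1..max_stage whose m = 2**(k-1) gives the largest H_theta / m."""
    return max(range(1, max_stage + 1), key=lambda k: h_per_use(2 ** (k - 1), r))

for exponent in range(4, 9):
    r = 2.0 ** -exponent
    # Columns: -log2(r), exact maximiser (5.57), approximation (5.58), best stage.
    print(exponent, round(optimal_m(r), 2), 1.0 / (2.0 * r), best_stage(r))
```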



Figures 5.4 to 5.8 give $H_\theta/m$ at the $k$th iterative stage. It can be seen that $H_\theta/m$ increases up to $k = -\log_2 r$, decreases slightly near $k = -\log_2 r + 1$ and falls rapidly for $k > -\log_2 r + 1$.



Tables 5.2 to 5.7 contain the results of simulations, for magnitudes of noise $r = 2^{-4}, 2^{-5}, \dots, 2^{-8}$ and total number of iterative stages $l = 4, \dots, 9$; the number of measurements at each stage is fixed. Consider the diagonals of Tables 5.2 to 5.7 from $r = 2^{-4}$, $l = 4$ to $r = 2^{-8}$, $l = 8$. This corresponds to the experimenter performing $l = -\log_2 r$ iterative stages, which involves going up to the iterative stage at which $F_\theta^M/m$ is maximized. Similarly, the diagonal from $r = 2^{-4}$, $l = 5$ to $r = 2^{-8}$, $l = 9$ corresponds to the experimenter performing $l = -\log_2 r + 1$ iterative stages, etc. It is interesting to note that when $l > -\log_2 r$, there is a significant decrease in the number of confidence intervals containing $\theta$, and when $l > -\log_2 r + 1$, an even greater decrease in the number of confidence intervals containing $\theta$. For example, using 30 measurements at each stage, if the experimenter performs $l = -\log_2 r$ iterative stages then the final confidence interval contains $\theta$ approximately 98% of the time; if the experimenter increases to $l = -\log_2 r + 1$ iterative stages, then the final confidence interval contains the true value of $\theta$ approximately 89% of the time. If the experimenter increases to $l = -\log_2 r + 2$ iterative stages, then approximately 61% of the time the final confidence interval contains $\theta$, a considerable drop in performance. It can be seen from Table 5.7, for which 200 measurements are performed at each stage, that this drop in performance does not just occur when performing relatively small numbers of measurements at each stage.

It is interesting to see that the drop off in performance, in terms of the coverage probability (which can be calculated from Tables 5.2 to 5.7), occurs at the same point as the drop in performance as measured by $H_\theta/m$, and consequently $F_\theta^M/m$, seen in Figures 5.4 to 5.8.

Since $F_\theta^M/m$ starts to decrease after $l = -\log_2 r$ iterative stages, it makes no sense to choose $l > -\log_2 r$. The simulations also suggest that it is safer to do no more than $l = -\log_2 r$ iterative stages. This is equivalent to letting $U_\theta$ act no more than $n' = 1/(2r)$ times on the same input state. Thus for a given level of noise the experimenter can let $U_\theta$ act on an input state more times than $n'$ satisfying $n'r \ll 1$ (though the $O((\log n/n)^2)$ rate may not be kept). A sensible suggestion is, more generally, that for the channel (5.56) the optimum number of iterative stages, where at the $k$th stage $U_\theta$ is used $2^{k-1}$ times, is $l = \lfloor -\log_2 r \rfloor$.

A related question was considered in Rubin and Kaushik (2007), where the 'stopping point' was $N$, the number of entangled photons to be included in the NOON input states. Rubin and Kaushik found that the optimal precision in measurement occurred for $N = 1.279/L$, where $L$ is the magnitude of loss (analogous to the point, $n' = 1/(2r)$, at which $F_\theta^M/m$ is maximized).

If $l = -\log_2 r$ iterative stages are performed and the final confidence interval contains $\theta$, this corresponds to a precision $|\hat\theta - \theta|_1 < r/3$. If the experimenter desires greater precision in his final estimate than $|\hat\theta - \theta|_1 < r/3$, then it seems sensible for him to perform more measurements at the final iterative stage.
[Figures 5.4 to 5.8: plots titled "H/m as a function of k", with $k$ on the horizontal axis.]

Figure 5.6: $H_\theta/m$ at the $k$th iterative stage, with $r = 2^{-6}$.

Figure 5.7: $H_\theta/m$ at the $k$th iterative stage, with $r = 2^{-7}$.

Figure 5.8: $H_\theta/m$ at the $k$th iterative stage, with $r = 2^{-8}$.



             Number of iterative stages (l)
r            4          5          6          7          8          9
2^{-4}       94,601     78,896     48,831     24,429     11,514     5,430
2^{-5}       98,804     94,854     79,207     49,625     24,891     11,738
2^{-6}       99,608     98,728     94,840     79,428     50,121     24,887
2^{-7}       99,779     99,571     98,768     94,917     79,544     50,130
2^{-8}       99,823     99,719     99,571     98,764     94,715     79,745

Table 5.2: Numbers of trials out of 100,000 with $|\hat\theta - \theta|_1 < 1/(2^l \times 3)$.






             Number of iterative stages (l)
r            4          5          6          7          8          9
2^{-4}       98,290     88,340     60,423     32,445     16,059     8,042
2^{-5}       99,804     98,408     88,537     61,293     32,756     16,460
2^{-6}       99,967     99,807     98,430     88,708     61,148     32,595
2^{-7}       99,985     99,955     99,802     98,476     88,895     61,699
2^{-8}       99,988     99,977     99,962     99,812     98,467     88,864

Table 5.3: Numbers of trials out of 100,000 with $|\hat\theta - \theta|_1 < 1/(2^l \times 3)$, with $N_{\mathrm{tot}} = 30$.

             Number of iterative stages (l)
r            4          5          6          7          8          9
2^{-4}       99,336     91,501     63,433     33,429     16,130     7,940
2^{-5}       99,972     99,349     91,753     64,469     34,079     16,262
2^{-6}       99,999     99,962     99,391     92,139     64,768     33,861
2^{-7}       99,999     99,993     99,960     99,388     92,190     64,744
2^{-8}       99,998     99,999     99,997     99,957     99,371     92,287

Table 5.4: Numbers of trials out of 100,000 with $|\hat\theta - \theta|_1 < 1/(2^l \times 3)$, with $N_{\mathrm{tot}} = 40$.



             Number of iterative stages (l)
r            4          5          6          7          8          9
2^{-4}       99,741     94,475     68,789     37,529     18,793     9,308
2^{-5}       99,991     99,738     94,644     69,626     37,976     19,139
2^{-6}       99,999     99,993     99,759     94,909     70,021     38,232
2^{-7}       100,000    99,998     99,995     99,770     95,030     70,402
2^{-8}       100,000    100,000    99,999     99,995     99,790     94,983

Table 5.5: Numbers of trials out of 100,000 with $|\hat\theta - \theta|_1 < 1/(2^l \times 3)$, with $N_{\mathrm{tot}} = 50$.



             Number of iterative stages (l)
r            4          5          6          7          8          9
2^{-4}       99,993     98,780     78,515     43,641     21,599     10,983
2^{-5}       100,000    99,994     98,924     79,739     44,374     22,150
2^{-6}       100,000    100,000    99,999     98,904     79,899     44,762
2^{-7}       100,000    100,000    100,000    99,997     98,966     80,004
2^{-8}       100,000    100,000    100,000    100,000    99,998     98,989

Table 5.6: Numbers of trials out of 100,000 with $|\hat\theta - \theta|_1 < 1/(2^l \times 3)$, with

             Number of iterative stages (l)
r            4          5          6          7          8          9
2^{-4}       100,000    99,907     87,516     50,576     25,592     12,868
2^{-5}       100,000    100,000    99,937     88,393     51,414     26,064
2^{-6}       100,000    100,000    100,000    99,946     88,754     52,022
2^{-7}       100,000    100,000    100,000    100,000    99,953     88,932
2^{-8}       100,000    100,000    100,000    100,000    100,000    99,938

Table 5.7: Numbers of trials out of 100,000 with $|\hat\theta - \theta|_1 < 1/(2^l \times 3)$, with $N_{\mathrm{tot}} = 200$.
Appendix A
Notation

notation                        definition

$|\psi\rangle$                  finite dimensional complex column vector of unit length (see (1.1)).
$\langle\psi|$                  dual of $|\psi\rangle$.
$\langle\psi|\phi\rangle$       inner product of $|\psi\rangle$ and $|\phi\rangle$ (see (1.3)).
$|\psi\rangle\langle\phi|$      outer product of $|\psi\rangle$ and $|\phi\rangle$ (see (1.6)).
$|0\rangle$                     $(1,0)^T$ ($T$ denotes transpose).
$|1\rangle$                     $(0,1)^T$.
$z^*$                           complex conjugate of $z$.
$\rho$                          density matrix (see Section 1.3).
$I$                             identity matrix.
$I_d$                           $d \times d$ identity matrix.
                                Pauli matrices (1.15).
                                Pauli matrices.
$M$                             POVM.
$M_m$                           element of POVM $M$ corresponding to outcome $m$.
$\mathcal{H}$                   Hilbert space.
$\mathcal{H}_{A,B}$             $\mathcal{H}_A \otimes \mathcal{H}_B$, an extended Hilbert space.
$A^\dagger$                     Hermitian transpose of $A$ (see Section 1.2).
$|\phi^A \phi^B\rangle$         $|\phi^A\rangle \otimes |\phi^B\rangle$.
$S(\mathcal{H})$                set of states on the Hilbert space $\mathcal{H}$.
$\rho_A$                        reduced state on $S(\mathcal{H}_A)$ (see Section 1.5.1).
$E_k$                           Kraus operator (see Section 1.7).
$U$                             unitary matrix (any matrix satisfying $UU^\dagger = U^\dagger U = I$).
$\mathcal{E}$                   quantum channel (see Section 1.7).
                                Fisher information (see Section 1.38).
$F^M$                           Fisher information from single measurement using $M$ (see Section 1.38).
                                converges in distribution to.
$\langle F\rangle$              expectation of $F$.
$\rho_\theta$                   parameterized family of states.
                                SLD quantum score (see (1.52)).
$C_E(\theta)$                   Sarovar and Milburn's bound based on arbitrary set of Kraus operators $\{E_k\}$ (see (2.3)).
                                canonical Kraus operator (see before (2.4)).
                                Sarovar and Milburn's bound based on canonical Kraus operators.
                                metric derived from the bound based on canonical Kraus operators (see (3.15)).
$H_\theta$                      SLD quantum information (see Section 1.9
$F(U, \hat U)$                  fidelity between $U$ and $\hat U$ (see (1.90)).
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